Nasa - troyp/edbrowse GitHub Wiki

Debugging javascript is a major endeavor, as I first discovered when debugging nasa.gov, which came up empty. It has megabytes of js, and we're not really sure why, since it is a readonly site that merely disseminates information. You don't buy things, you don't check your bank account, you can't even log in. As Avril Lavigne says, "Why'd you have to go and make things so complicated?" Most informational sites, like newspapers, can be read with js disabled and they're fine, but some, like nasa, come up empty without js enabled. Even with js on, it still came up empty, thus there was a problem. I spent many days finding and fixing the bug, commit 5bd24b62c5e36dc19ec62b284bea77affe5f75dd, but in doing so, I realized we need better debugging tools, including tracing, debug prints, and breakpoints.

First, we have the javascript debugger, which you enter via the jdb command. It is basically a javascript shell inside edbrowse. Type . or bye to exit the shell and return to edbrowse proper. A few edbrowse commands are interpreted here, such as db3 for debug level, e7 to go to another session, shell escape, etc. Javascript expressions are evaluated, and the document objects are available. document.head is the head of your document <head>, document.body is the body <body>, document.body.firstChild is the first node under <body>, and so on. You want at least db3 to show you the errors as you enter expressions. At db2, you can type the syntactically incorrect line

7 7

and you are greeted with unhelpful silence. At db3 or above you get

jdb line 1: Syntax Error: unterminated statement

If o is an object, ok(o) shows you the object keys, at least those that are enumerable. If x is one of the keys, then o.x returns the value of x. If you're not sure what x is, typeof o.x will tell you. Warning: react is a complex js framework used by some websites, and it redefines ok. You can use Object.keys instead, but it's longer to type and isn't quite as powerful. The alias $ok will work, and is just one keystroke longer than ok.

natok is a native version of ok, and it gives all the properties under an object.
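To see why a listing of enumerable keys can miss properties, here is a small sketch in plain javascript. It does not call ok or natok and says nothing about how they are implemented; the object o and its property names are made up for illustration, and the two standard functions shown are only rough analogues.

```javascript
// Plain javascript, runnable outside edbrowse.
// Hypothetical object with one enumerable and one non-enumerable property.
const o = { visible: 1 };
Object.defineProperty(o, "hidden", { value: 2, enumerable: false });

// Object.keys reports only the enumerable own properties.
const enumerableKeys = Object.keys(o);

// Object.getOwnPropertyNames reports own properties whether enumerable or not.
const allOwnKeys = Object.getOwnPropertyNames(o);

console.log(enumerableKeys);
console.log(allOwnKeys);
console.log(typeof o.hidden); // the value is still reachable by name
```

If a key you expect is missing from one listing, try the other before concluding the property isn't there.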

Finding nodes buried somewhere in the tree can be challenging, but querySelectorAll() can help. This isn't a debugging tool, it is standard in all browsers, but it can be helpful.

a = querySelectorAll("p")    list all the paragraphs in the document

a = querySelectorAll("input")    list all the input elements in the document

a = querySelectorAll("p.instruction")    list all the paragraphs with class=instruction

a = querySelectorAll("#login")    list the node whose id=login

There should only be one node with a given id, so instead of an array, you can fetch that particular node directly.

a = querySelector("#login")    returns a node not an array of nodes
a = querySelectorAll("#login")[0]    equivalent

Once you have found a node, use dumptree(n) to see the descendants of a node, and uptrace(n) for the ancestors. The latter shows the class and id of each node, as they are very important in html, javascript, and css.

Scripts are of particular importance, so there is a special function showscripts() to gather all the scripts, even those that were dynamically created, and put them into a special array $ss. You will see the length of each script, and where it comes from (url), and whether it was deminimized. (More on deminimization below.) Use ok($ss[3]) to find out what we know about script number 3. src holds the url, if it came from another web page, and data holds the actual data. Since > is an operator in javascript, I use ^> to write to a file. Thus you can save a copy of the script to disk like this.

$ss[3].data ^> script3.js

There may be dozens of scripts, so the searchscripts() function helps you find the script that contains a particular string.

searchscripts("foobar")
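The idea of searching the script array can be sketched in plain javascript. This is not edbrowse's searchscripts(); the helper findScripts and the sample records are hypothetical, and only the src and data fields are taken from the description above. Treating an inline script as having an empty src is also an assumption.

```javascript
// Hypothetical stand-in for $ss: each record has src (url) and data (text).
const scripts = [
  { src: "https://example.com/vendor.js", data: "function a(){return 1}" },
  { src: "https://example.com/app.js", data: "var foobar = 27;" },
  { src: "", data: "console.log(foobar)" }, // assumed: inline script, no url
];

// Return the index and origin of every script whose text contains needle.
function findScripts(list, needle) {
  const hits = [];
  list.forEach((s, i) => {
    if (typeof s.data === "string" && s.data.includes(needle)) {
      hits.push({ index: i, src: s.src || "(inline)" });
    }
  });
  return hits;
}

console.log(findScripts(scripts, "foobar"));
```

The index matters most, since that is what you would plug into $ss[n] to inspect or save the script.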

Since the css files are more static, they are already gathered into an array; you don't need to issue a command to do it. cssSource is an array of objects, similar to $ss, but each object describes a css file, rather than a javascript file. src holds the url of the css file, and data holds the actual data. The url is typically the href attribute in the <link> tag, and the data is the contents of that web page.

<link href="https://something.com/foobar.min.css"  rel="stylesheet" type="text/css">

$pjobs is an array that holds the Promise jobs, at level 3 or above. Thus you can see what functions were executed by Promise.

As you browse and unbrowse, and fill out forms, and push buttons, and move in and out of jdb, there are debug flags beyond the numeric debug level. These are toggle flags.

dbev debugs the events and the listeners for those events. You will see when listeners are added and removed, when events are dispatched, and when handlers are called. You want to be familiar with the edbrowse convention for event handlers. Node.onclick is the function to run when a node captures or bubbles the onclick event, and that is standard javascript, we can't change that. So we put in our own onclick() function that calls all the handlers that have been added by addEventListener(). These are stored in Node.onclick$$array. If there was an onclick function originally, either from an html tag setting the onclick attribute, or because javascript added an onclick function to the Node, and then more handlers are added via addEventListener(), the original handler is stored in Node.onclick$$orig. So our onclick function checks for onclick$$orig first and runs that, and if it does not fail, it steps through the handlers in onclick$$array and runs those in turn as long as they don't fail. The same pattern applies for onload, oninput, onchange, and all the other events. Of course things like onmouseout and onfocus and such we ignore, because edbrowse has no such events.
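The handler convention is easier to follow with a toy model. The sketch below is plain javascript that runs outside edbrowse; installDispatcher and addListener are hypothetical helpers, not edbrowse functions, and only the names onclick$$orig and onclick$$array come from the text. It also assumes that a handler "fails" when it returns false, which may not match what edbrowse actually checks.

```javascript
// Replace node[ev] with a dispatcher, keeping any existing handler in ev$$orig.
function installDispatcher(node, ev) {
  if (typeof node[ev] === "function") node[ev + "$$orig"] = node[ev];
  node[ev + "$$array"] = [];
  node[ev] = function (e) {
    const orig = this[ev + "$$orig"];
    // Run the original handler first; stop if it fails (assumed: returns false).
    if (orig && orig.call(this, e) === false) return false;
    // Then run the added handlers in order, stopping at the first failure.
    for (const h of this[ev + "$$array"]) {
      if (h.call(this, e) === false) return false;
    }
    return true;
  };
}

// Toy version of addEventListener: append to the handler array.
function addListener(node, ev, handler) {
  node[ev + "$$array"].push(handler);
}

const calls = [];
const button = { onclick() { calls.push("orig"); } };
installDispatcher(button, "onclick");
addListener(button, "onclick", () => { calls.push("first"); });
addListener(button, "onclick", () => { calls.push("stop"); return false; });
addListener(button, "onclick", () => { calls.push("never"); });

const result = button.onclick({ type: "click" });
console.log(calls, result);
```

When a handler you expect never runs, look at Node.onclick$$orig and Node.onclick$$array in jdb to see whether it was registered at all, and what runs ahead of it.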

dbcss debugs css. There can be thousands of selectors and rules; the results are stored in /tmp/css.

dberr shows you all the errors, even those caught in a try catch block. Many of these errors are ok, as far as the website is concerned, that's why they are in try blocks; so turning this on can be more distracting than helpful. Note: this hasn't worked since our switch to QuickJS.

timers disables javascript timers, and you often want to do this so that things are frozen while you debug.

Now, before we can trace execution or add breakpoints, we need a local copy of the website. I'll illustrate with nasa.gov.

Make a directory somewhere called nasa and cd into it. This directory should start out empty.
Call up edbrowse and set demin for deminimization, and trace to add trace points, and db3 if you want to see what is going on.

demin trace db3 b https://www.nasa.gov
Verify that you have about 100 lines, which means you got real content. Try to ignore the errors for now.
Jump into jdb and run snapshot(). This is a special function that creates a local copy of the website as best it can. It prints a reminder of what you need to do next.
Use . to exit out of jdb, then ub to unbrowse. Read the file called from into the current html file just after the <head> tag. You may need to cut the first few tags apart if they are all on one line, so you can place the Base tag just after the Head tag and before other tags. The from file will look like this.

This sets the base tag so the local website looks like it came from the server at nasa.gov. Finally, save this to a local file called base, and then quit.

ls to see what has happened. There are lots of new files.
edbrowse the base file, just a local file. db3 to see what is happening, and b to browse. Almost all of the javascript files, and certainly the ones we care about, are fetched locally from your directory. Also the css files. You should get the same errors, and the same 100 lines of stuff.
You can change the js files, add alert statements, breakpoints, etc, and browse and change and browse again, and otherwise debug.

For ongoing work, you can change the filenames to be more intuitive. Perhaps f2.js is vendor.js and f3.js is nasa.js. Just make the same change in the jslocal map.

The js files are deminimized, with lots of trace points. A trace point looks like trace@(d221). The global variable step$l controls the trace. If step$l = 0 then the javascript runs as before, though much slower. If step$l = 1 then each trace label is printed as execution proceeds. Combine this with db3 to see where errors are happening. If step$l = 2 then edbrowse stops at each trace point, as though it were a breakpoint. The breakpoint shell is similar to jdb but much more restricted. You can't issue any edbrowse commands at all, since javascript is actively running here. You can view and even modify the local variables that are in scope, or the global variables. The object known as this is preserved. The special symbol arg$ is the arguments object of the currently running function. Type . to exit and resume execution. You can change step$l to change tracing.
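The three trace levels can be modeled in a few lines of plain javascript. This is only a sketch of the behavior described above, not edbrowse's implementation: stepLevel stands in for step$l, tracePoint stands in for the trace@ macro, and instead of printing or opening a breakpoint shell it records labels in two arrays. Whether level 2 also prints the label is not stated above, so the sketch does not assume it.

```javascript
let stepLevel = 0;     // stands in for step$l
const printed = [];    // labels that would be printed at level 1
const stops = [];      // labels where a break would occur at level 2

// Hypothetical stand-in for a trace point such as trace@(d221).
function tracePoint(label) {
  if (stepLevel === 1) printed.push(label);
  if (stepLevel === 2) stops.push(label);
}

// A traced function: every statement group is preceded by a trace point.
function work() {
  tracePoint("d1");
  const x = 6 * 7;
  tracePoint("d2");
  return x;
}

stepLevel = 0; work();   // silent
stepLevel = 1; work();   // records d1, d2 as printed
stepLevel = 2; work();   // records d1, d2 as break locations

console.log(printed, stops);
```

Even at level 0 every trace point costs a function call and a comparison, which is consistent with traced code running slower than the original.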

You can add your own breakpoint with the bp@ macro, i.e. bp@(label). Edbrowse will say break at line label, then you can look around. Again, type . to exit.

I worked on one site that added its own toString() functions to various prototypes, and these in turn had trace points, so when I asked for the value of x, and it tried to turn x into a string, it entered the toString() function associated with x, which triggered more breakpoints, which was really really confusing. I entered . . . and finally got back to the string value of x. I hope this is a rare occurrence.

At the beginning of your base file, you can add <script> step$go = 'd221'; </script>, and step$l will change to 2 when execution reaches trace@(d221). Edbrowse breaks at every trace point thereafter, until you downgrade step$l. Another trigger mechanism is <script> step$exp = "foobar==27"; </script>. Your expression will be evaluated everywhere, so make sure it refers to global variables, or nodes that are uniquely identified in the dom tree, by using querySelector("#something") for example.
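Here is a rough model of the two triggers in plain javascript. It is a sketch under several assumptions, not edbrowse's code: the state object stands in for step$l, step$go, and step$exp; checkpoint is a hypothetical stand-in for a trace point; a break is recorded in an array rather than opening a shell; the break is assumed to start at the matching label itself; and a true expression is assumed to raise the level to 2 just as a matching label does.

```javascript
// Hypothetical stand-ins for step$l, step$go, and step$exp.
const state = { level: 0, go: "d221", exp: "" };
const breaks = [];

function checkpoint(label) {
  // Label trigger: reaching the named trace point switches to level 2.
  if (state.go && label === state.go) state.level = 2;
  // Expression trigger: evaluated in global scope at every trace point,
  // which is why it should refer only to global names.
  if (state.exp) {
    try {
      if ((0, eval)(state.exp)) state.level = 2;
    } catch (e) {
      // A name that does not exist yet is treated as "not triggered".
    }
  }
  if (state.level === 2) breaks.push(label);
}

// Label trigger: nothing breaks until d221 is reached.
["d220", "d221", "d222"].forEach(checkpoint);
const afterLabel = breaks.slice();

// Expression trigger: nothing breaks until the global foobar equals 27.
state.level = 0; state.go = ""; state.exp = "foobar==27";
breaks.length = 0;
checkpoint("d300");
globalThis.foobar = 27;
checkpoint("d301");
const afterExp = breaks.slice();

console.log(afterLabel, afterExp);
```

The indirect eval, written (0, eval)(...), is what forces global scope here; a local variable named foobar inside a traced function would be invisible to the expression.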

Have fun.