Studio is a debugger for the data produced by complex applications. Studio imports dense and "messy" data in an application's own native formats, then it applies all the tools that are needed to extract useful information, and finally it presents the results in an interactive graphical user interface.
Studio is particularly intended for complex applications with many moving parts: JIT compilers, network stacks, databases, operating system kernels, and so on. These applications are often running in a production environment where they cannot be stopped to perform diagnostics - but they can quickly dump some raw data for offline inspection.
Such applications are not especially well served by the usual ad-hoc scripts based on gdb, graphviz, gnuplot, perl, and so on, but they usually do not justify building and supporting full-scale fancy tools from scratch either. Studio bridges this gap by providing a common foundation that can be reused for new applications.
How does Studio make this easier than doing everything yourself? The key is combining two very powerful tools: Nix for processing data and Pharo for user interaction.
Nix makes it easy to incorporate other tools. This could be as simple as running a custom Python application or running an R script with all of its dependencies loaded. It could also be more complex, like running an ARM virtual machine to process data using proprietary vendor tools as one step in the processing pipeline. Nix provides all the flexibility needed to handle such problems, and it throws in caching, parallelization, and distributed builds as a bonus.
Pharo makes it easy to visually inspect object graphs and other complex data structures. Pharo itself is a full-scale modern Smalltalk development environment, the Agile Visualization toolkit makes it easy to map your data onto visual objects that you can manipulate, and the Glamorous Toolkit provides the tools for navigating and searching your data. The visualization could be simple, like rendering JSON as a graph or CSV as a chart, or it could be complex like visualizing high-level application data structures after extracting them from a coredump file using DWARF metadata.
Do you have an application that deserves its own tools? Studio is ready for you to extend. Screenshots galore can be found in the Supported Applications section.
Studio is a GUI application for Linux/x86-64. You can run Studio directly (X11 mode) or remotely (VNC mode). Both modes are supported "out of the box."
macOS users can use VNC mode to run Studio on a server, a cloud VM, a Docker container, a VirtualBox VM, etc.
The ideal deployment environment is a server with plenty of resources. The Studio backend does extensive parallelization and caching so it is able to make good use of CPU cores, network bandwidth, RAM, disk space, etc.
You can also run Studio on many different machines, at the same time or at different times, because these installations do not store any important state on local storage. Everything is accessed from the network and local storage is only used for caching.
Studio is installed using the Nix package manager. You need to install Nix before installing Studio.
Here is a one-liner for installing Nix:
$ curl https://nixos.org/nix/install | sh
You can install Studio directly from a source tarball. Here is the command to install the current master branch:
$ nix-env -iA studio -f https://github.com/studio/studio/archive/master.tar.gz
Studio can be updated to the latest version by re-running the installation command. The URL can be updated to point to any Studio source archive. Switching back and forth between multiple versions is no problem.
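As a minimal sketch of switching versions, the helper below prints the install command for a given Git ref. The function name studio_install_cmd is hypothetical, and the sketch assumes that GitHub's usual archive/<ref>.tar.gz URL layout works for tags and commits as well as for branches.

```shell
# Hypothetical helper: print the nix-env command that installs Studio
# from the source archive of a given Git ref (defaults to master).
studio_install_cmd() {
    ref="${1:-master}"
    echo "nix-env -iA studio -f https://github.com/studio/studio/archive/${ref}.tar.gz"
}

# Print the command for the master branch.
studio_install_cmd master
```

Printing the command rather than running it makes it easy to review the URL before installing.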
You can run the Studio GUI either locally (X11) or remotely (VNC). The command studio-x11 runs the GUI directly on your X server, while the command studio-vnc creates a VNC desktop running Studio for remote access.
Usage:
$ studio-x11
$ studio-vnc [extra-vncserver-args...]
The VNC server used is tigervnc.
The recommended VNC client is tigervnc, which supports automatically resizing the desktop to suit the client window size. On macOS with Homebrew you can install tigervnc with brew install --cask tigervnc-viewer (older Homebrew releases used brew cask install tigervnc-viewer) and then run vncviewer <server>[:display].
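VNC display numbers conventionally map to TCP ports as 5900 plus the display number, so display :7 listens on port 5907. This matters when forwarding a display through a tunnel or firewall. A small sketch of the arithmetic follows; the function name vnc_port is made up for illustration.

```shell
# Hypothetical helper: convert a VNC display (":7" or "7") to its TCP port,
# using the conventional mapping port = 5900 + display number.
vnc_port() {
    display="${1#:}"            # strip a leading colon if present
    echo $((5900 + display))
}

# Port used by display :7.
vnc_port :7
```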
Here is a Studio-over-SSH cheat sheet:
Start Studio on the server: ssh <server> studio-vnc [:display]. If no display is specified then an available one is assigned automatically.
Forward the VNC port for display 7 (port 5900 plus the display number): ssh -L 5907:localhost:5907 <server>.
Connect the viewer through the tunnel: vncviewer localhost:7.
Shut down the VNC server: ssh <server> vncserver -kill <:display>.
The Studio user interface is based on the Miller columns paradigm. Specifically, Studio uses a Miller columns implementation called GTInspector. In this section we first illustrate Miller columns with a familiar example and then use this to explain the Studio interface.
Let us illustrate the Miller column concept with a widely known example from the macOS Finder.
macOS Finder
The screenshot shows the "Columns" mode of the macOS Finder. We can deconstruct the picture this way:
Studio uses this same paradigm. The user inspects a directed graph of objects. These objects can be of many diverse types, each with its own distinct visual presentation, and when a new object is selected in a pane it is inspected in the next pane immediately to the right.
Here is a screenshot of the Studio UI:
Studio UI
We can deconstruct this in the same way. There are some key similarities:
In Studio each pane presents a tabbed set of named views. Each view presents the same object in a different way.
For example, a JIT trace could be shown as a graphical tree of IR instructions, or as a table of IR instructions and their operands, or as textual disassembled machine code. The most appropriate choice depends on what the user is interested in at a given moment. The interface makes it easy to switch between views with a mouse click on the right tab.
The screenshot shows four side-by-side panes, but the exact number of panes shown at any time is controlled by the user. Studio starts with one pane and then adds a second when the first object is selected. As more objects are selected the UI automatically "scrolls" to the right. This means that by default the user will see the right-most two panes that are deepest in the inspection chain, while the left-most panes further up the chain will have scrolled off screen.
The user can also directly control which panes are visible using this control that is always present at the bottom of the window:
Controls
The circles represent the panes and the darker shaded area indicates which panes are currently visible. These mouse actions are available:
Studio uses the Miller column implementation from the Glamorous Toolkit (GT) Inspector user interface framework. This framework provides the whole basic UI. Studio then extends this framework to support more relevant kinds of objects.
You can find more information about the Glamorous Toolkit in the Mastering Studio section.
Studio is built on three main abstractions: products, builders, and presentations. Products are raw data on disk; builders are scripts that can create products; presentations are graphical views of products.
A product is raw data - a directory on disk - in a well-defined format. Each product has a named type that informally defines the layout of the directory and the characteristics of the files. The type is always specified in a file named .studio/product-info.yaml.
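To make the directory convention concrete, here is a minimal sketch that creates a product skeleton and reads its type back. The function names make_product and product_type are hypothetical and not part of Studio, and the sketch assumes that product-info.yaml holds a single top-level line of the form type: <name>.

```shell
# Hypothetical helpers, not part of Studio itself.

# Create a product directory skeleton that declares the given type.
make_product() {
    dir="$1"
    ptype="$2"
    mkdir -p "$dir/.studio"
    echo "type: $ptype" > "$dir/.studio/product-info.yaml"
}

# Print the declared type of the product in the given directory.
product_type() {
    sed -n 's/^type:[[:space:]]*//p' "$1/.studio/product-info.yaml"
}

# Demo on a scratch directory with a made-up type name.
demo="$(mktemp -d)"
make_product "$demo" example/type
product_type "$demo"
```

A real YAML parser would be more robust than sed; the one-line form is used here only to keep the sketch dependency-free.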
A builder is a script - a Nix derivation - that creates one or more products. The builder takes some inputs - parameters, files, URLs, output of other builders, etc - and uses them to produce a product. Certain builders have simple high-level APIs that are easy for users to call interactively. Other builders have intricate APIs and are used as component parts of higher-level builders.
A builder also takes all of the software that it requires as an input. This is completely natural with Nix. If specific software is needed, in specific versions, from specific Git branches, with specific patches, etc, then it can be provided with Nix. Indeed, most common software packages are already available out of the box from the nixpkgs package repository and can easily have their versions overridden.
A presentation is an interactive user interface - a live Smalltalk object - that presents a product (or a component part of a product) to the user. The input to the presentation is a product stored on the local file system. The presentation code then adds new view tabs to the inspector.
The raptorjit-vm-dump product represents a snapshot of the way a RaptorJIT process has executed.
audit.log is a log file in a RaptorJIT-native format (based on msgpack) that lists all actions the JIT has taken, including trace aborts and trace completions.
raptorjit-dwarf.json is a DWARF type information dump that has been converted to JSON format by dwarfish.
vmprofile/ is a directory containing zero or more profiler datasets in the RaptorJIT-native vmprofile format.
The snapshot may have been taken either during or after execution. (RaptorJIT VM state is always externally available via the filesystem and so it can be "passively" snapshotted by the user at any time, including after the process has terminated.)
Each of these functions creates a raptorjit-vm-dump product that the Studio frontend can inspect:
raptorjit.run <luaSourceString>
Evaluate a string of Lua source.
Returns a raptorjit-vm-dump product.
raptorjit.runDirectory <path>
Evaluate *.lua in <path>.
Returns a raptorjit-vm-dump product.
raptorjit.runTarball <url>
Evaluate *.lua in the contents of the tarball at <url>.
Returns a raptorjit-vm-dump product.
raptorjit.inspect <path>
Process files produced by a previous execution of RaptorJIT.
Detects audit.log, raptorjit.dwo, and **/*.vmprofile.
Returns a raptorjit-vm-dump product.
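The detection step of raptorjit.inspect can be pictured with a short shell sketch. This is not Studio's actual detection code: the function name find_raptorjit_files is hypothetical, and the sketch assumes that audit.log and raptorjit.dwo sit at the top level of the directory while *.vmprofile files may appear at any depth.

```shell
# Hypothetical sketch: list the files in a directory that match the
# patterns audit.log, raptorjit.dwo, and **/*.vmprofile.
find_raptorjit_files() {
    dir="$1"
    # Top-level files (assumed location).
    for name in audit.log raptorjit.dwo; do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    # Profiler datasets at any depth.
    find "$dir" -type f -name '*.vmprofile' | sort
}

# Demo on a scratch directory with made-up file names.
scratch="$(mktemp -d)"
mkdir -p "$scratch/vmprofile/worker"
touch "$scratch/audit.log" "$scratch/raptorjit.dwo" \
      "$scratch/vmprofile/worker/main.vmprofile" "$scratch/notes.txt"
find_raptorjit_files "$scratch"
```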
The RaptorJIT Process presentation represents the way a RaptorJIT VM has executed: aborted tracing attempts, successful traces and their generated code, and profiler data.
The Traces Overview view visually shows all of the traces of machine code that have been generated by the JIT.
RaptorJIT-Process-TraceOverview
Outer black boxes represent sets of root traces, i.e. traces that all begin on the same bytecode.
Inner blue boxes represent individual traces. Clicking the box inspects the trace. The area of the box is proportional to the number of IR instructions.
The Trace List view shows all of the traces in a table.
RaptorJIT-Process-TraceList
Clicking on a row inspects the trace.
The VMProfiles view summarizes the profiler datasets that are available. Applications are free to create as many profiles as they like and to switch between them at arbitrary times e.g. during different processing states. Clicking a row inspects the profile.
RaptorJIT-Process-VMProfiles
RaptorJIT-VMProfile-HotTraces
RaptorJIT-Trace-IRTree
RaptorJIT-Trace-IRListing
RaptorJIT-Trace-DWARF
(Not yet written.)
For a running example, let us define a product type called xml/packet-capture/pdml that represents a network packet capture in XML format. Here is how we informally define this product type:
.studio/product-info.yaml will include type: xml/packet-capture/pdml.
packets.pdml will exist at the top level.
packets.pdml will contain network packets in the PDML XML format defined by Wireshark.
This simple product definition defines the interface between builders, which must produce directories in this format, and presentations, which display those directories to the user.
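The three rules above can be turned into a quick conformance check. The sketch below is only an illustration: is_pdml_product is a hypothetical name, and the third rule is approximated by looking for a pdml root element rather than validating the XML.

```shell
# Hypothetical check, not part of Studio: succeed only if the directory
# satisfies the three informal rules for xml/packet-capture/pdml.
is_pdml_product() {
    dir="$1"
    # Rule 1: product-info.yaml declares the expected type.
    grep -q '^type: xml/packet-capture/pdml$' \
        "$dir/.studio/product-info.yaml" 2>/dev/null || return 1
    # Rule 2: packets.pdml exists at the top level.
    [ -f "$dir/packets.pdml" ] || return 1
    # Rule 3 (approximation): the file contains a <pdml> element.
    grep -q '<pdml' "$dir/packets.pdml"
}

# Demo: build a tiny conforming directory and check it.
good="$(mktemp -d)"
mkdir -p "$good/.studio"
echo 'type: xml/packet-capture/pdml' > "$good/.studio/product-info.yaml"
echo '<pdml version="0"></pdml>' > "$good/packets.pdml"
is_pdml_product "$good" && echo "valid product"
```

A check like this is handy while developing a builder, since a presentation can only rely on what the product type promises.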
Digression: One can imagine many other product types. For example, a type application/crash-report might represent debug information about an application that has crashed and include the files exe (an ELF executable with debug symbols), core (a core dump at the point of the crash), config.gz (Linux kernel configuration copied from /proc/config.gz), and so on. Such products could serve as intermediate representations from which to derive other products, like high-level summaries or low-level disassembly of the relevant instructions, using tools like gdb and objdump and so on.
For example, let us define a builder that takes for input the URL of a packet capture in binary pcap file format and for output creates a product of type xml/packet-capture/pdml.
# pdml API module
pdml = {
  # inspect-url: build a PDML product from the pcap file at pcap-url
  inspect-url = pcap-url:
    let
      # fetched capture file, interpolated into the build script below
      pcap-file = fetchurl pcap-url;
    in
    runCommand "pdml-from-pcap-url"
      # inputs
      { buildInputs = [ wireshark ]; }
      # build script
      ''
        mkdir -p $out/.studio
        echo 'type: xml/packet-capture/pdml' > $out/.studio/product-info.yaml
        # -T pdml selects PDML output; -r reads packets from the capture file
        tshark -T pdml -r ${pcap-file} > $out/packets.pdml
      '';
};
This builder can be invoked in a script like this:
pdml.inspect-url http://my.site/foo.pcap
and it will produce a Studio product as a directory in exactly the expected file format.
Note: We have specified our software dependency simply with the name wireshark. This means that Studio will download and use the default version in the base version of nixpkgs. That is, Studio would always use exactly the same version of wireshark no matter where it is running. If we wanted to use a more specific version, or apply patches to support some new experimental protocols, etc, then this would be straightforward with Nix.
Finally, here is a presentation, written in Smalltalk, that displays products of this type:
StudioPresentation subclass: #PDLPacketCapturePresentation
    instanceVariableNames: 'xml'
    classVariableNames: ''
    package: 'Studio-UI'

PDLPacketCapturePresentation class >> supportsProductType: type
    "Match the type string written by the builder."
    ^ type = 'xml/packet-capture/pdml'.

PDLPacketCapturePresentation >> openOn: dir
    "Parse the capture when the product directory is opened."
    xml := XMLDOMParser parseFileNamed: dir / 'packets.pdml'.

PDLPacketCapturePresentation >> gtInspectorPacketsIn: composite
    <gtInspectorPresentationOrder: 1>
    "Reuse the standard XML tree view."
    xml gtInspectorTreeIn: composite.
Once we have defined a product type, a builder, and a presentation then we have added a new capability to Studio.
We can run our builder on the URL of a standard Wireshark example trace for a PPP handshake:
pdml.inspect-url https://wiki.wireshark.org/SampleCaptures?action=AttachFile&do=get&target=PPPHandshake.cap
which creates the product for our presentation to show as an XML tree:
XML PDL Tree browser