Preliminary Survey Results - neuron-team/vscode-ipe GitHub Wiki

'Proof of Concept' Prototype - User Experience Survey

Goal

The UX concepts developed for a data scientist workflow in VS Code were validated through user reviews. The aim was to improve the integration between the rich editor experience in VS Code and the interactive programming model of Jupyter Notebooks, since the resulting experience is neither a pure editor experience nor a pure Jupyter experience. Feedback was collected from customers to determine whether this experience solves real problems that they have in their jobs. The responses obtained can be seen below.

Venkata Vemulapalli, Alaska Airlines

Top Takeaways

He liked the concept – he was able to quickly understand its relationship to Jupyter Notebooks. In fact, he wanted it to be more like Jupyter Notebooks.
He is training to become a data scientist – we should keep an eye out for developers turned data scientists to see if there are opportunities to build tools that are more natural and familiar to data scientists with a developer background.

Hypotheses Validated:

  Interactively running a selection of code is intuitive - YES

  Interactively running the current line of code is intuitive - YES

  Popping the output cell into its own browser tab is intuitive - YES

  Popping the output cell into its own browser tab is useful - YES

  Rearranging the output cell using the arrow buttons is intuitive - YES

  Deleting the output cell using the x button is intuitive - YES

  Rearranging / cleaning up the output window contents is useful - YES

  Export to Jupyter function is useful - YES

  This is an interactive experience that the user wants - YES

Workflow Questions:

1. Can you describe what your role is in your day job?

Works as a developer at Alaska Airlines
Primarily data engineering tasks – working with Spark and Cosmos DB, Azure Data Lake
Working on a use case to get on-premises implementations of these set up
Early adopter of Azure ML Workbench
Taking a data science certification at UW

2. What programming languages do you use for Data Science?

Uses Python as his primary data engineering language, and in his data science course at UW

3. What tools do you use in your Python work?

Jupyter
Converts his Jupyter notebooks to Python scripts to run and uses PyCharm for that
Azure ML Workbench
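
The notebook-to-script step mentioned above can be sketched roughly as follows. This is only an illustration of the general idea, not the tooling the interviewee uses: it assumes the standard `.ipynb` JSON layout (a top-level `cells` list whose code cells carry a `source` field), and the function names `notebook_to_script` and `convert_file` are hypothetical.

```python
import json


def notebook_to_script(notebook):
    """Join the source of every code cell into one Python script string."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format allows source to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    # Separate cells with one blank line and end with a newline.
    return "\n\n".join(chunks) + "\n"


def convert_file(ipynb_path, py_path):
    """Hypothetical helper: read a notebook file and write the script."""
    with open(ipynb_path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(py_path, "w", encoding="utf-8") as f:
        f.write(notebook_to_script(notebook))


if __name__ == "__main__":
    example = {
        "cells": [
            {"cell_type": "markdown", "source": ["# Title"]},
            {"cell_type": "code", "source": ["import math\n", "x = math.sqrt(16)"]},
            {"cell_type": "code", "source": "print(x)"},
        ]
    }
    print(notebook_to_script(example))
```

Cell magics and shell escapes (lines starting with `%` or `!`) are not handled here and would need to be stripped or translated before the script could run outside Jupyter.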

4. Looking at the concept, how would you execute the first line of code?

Highlight and run

5. Show what the concept looks like with the window showing up in the right hand pane. What do you think of this solution?

He prefers this approach to Jupyter, where everything needs to be separated into cells to run – here he can select just what he wants to run

6. Now imagine that you have many output cells created on the right hand pane. How would you figure out what code was run to generate the output?

Click on the chevron

7. Now imagine running the plotly code on the left. How would you execute lines 3-11 in the editor pane?

Select and run

8. Show what the concept looks like after running the plotly code. What do you think of this solution?

Feels it is intuitive
Suggestion: would like to have an arrow to the left to move code back into the editor window

9. Now imagine that you wanted to compare two output cells in the right hand window, but they were separated by quite a vertical distance (so you would need to scroll up and down to compare)

I led the witness here by asking what he thought the arrow to the top right did, but the result wasn’t surprising to him

10. After doing work for a while, the right hand side now has a lot of output. How would you clean things up to tell a better story?

“x” button to delete cells
I led the witness here by asking him what he thought the up and down arrows did

11. Now that you have the right hand window cleaned up, how would you share your work with others?

Export to Jupyter button – felt it would give him the best of the Sublime experience with Jupyter

Solution Feedback Questions:

1. What are your impressions of this concept?

Likes the concept
Plotting information is more useful
Did not know that VS Code had Python support
Suggestion: he would like an option to have inputs and outputs on the same screen (much like a pure Jupyter experience)

2. Who else is this concept relevant to? (Maybe they’re not the best user for it, but their teammate is.)

Developers
Business Analysts have basic knowledge of Python
Don't have access to build these kinds of reports with knowledge of Python

3. What products, services, or technologies do you use in place of this concept?

Writing Python scripts in Jupyter notebook
Could have used this today

4. What do the competitor offerings for this concept do a particularly good/poor job at?

Didn’t know

5. Other comments that could be incorporated into future questions:

Tak Mack, Alaska Airlines

Top Takeaways

He likes the concept because it lets him understand his code better. It lets him see all of his code vs. the Jupyter approach where all of the code is “in chunks”.
He uses Jupyter for exploration and collaboration. He thinks that the export to Jupyter option will make it easier to collaborate with other people.
He views SQL as his primary language for data science, because that is how you get the data. Python is used for analysis.

Hypotheses Validated:

  Interactively running a selection of code is intuitive - YES

  Interactively running the current line of code is intuitive - YES

  Popping the output cell into its own browser tab is intuitive - YES

  Popping the output cell into its own browser tab is useful - YES

  Rearranging the output cell using the arrow buttons is intuitive - YES

  Deleting the output cell using the x button is intuitive - YES

  Rearranging / cleaning up the output window contents is useful - YES

  Export to Jupyter function is useful - YES

  This is an interactive experience that the user wants - YES

Workflow Questions:

1. Can you describe what your role is in your day job?

Tak works at Alaska Airlines. He came over as part of the Virgin merger. He works as a Data Scientist, and he’s leading the transition to the cloud for his team.
In his IC work, he’s mostly managing projects and getting data “through an FAA service”. Previously he worked at textbook rental company Chegg, creating demand forecasting models.
An example modeling scenario they recently were working on was figuring out how many bags they should ticket at the gate so they don’t overflow the plane’s capacity
The models that they create are mostly classic models (e.g., random forest), but they also use neural nets
He thinks that it’s easy to switch between different model algorithms (“everything is just a package”)
When asked about what he prefers, he echoed the concern that neural nets are hard to explain and said that classic models are easier to explain to stakeholders
He likes Jupyter for its ability to experiment / explore, but doesn’t like it because it’s harder to understand code as a sequence of chunks
They use DSVM today for some of their scenarios. He would like to use Azure Notebooks to access databases. Others on his team have tried / used Azure Notebooks.
Database access is key – they have both a SQL Server and an Informix(!) database that they access from Azure via Express Route
They are also building out an Azure Data Lake solution today.

2. What programming languages do you use for Data Science?

He says that SQL is his primary language (that is how you get the data).
They use Python for analysis, even though he prefers R. He sees that R usage is increasing over time on his team.

3. What tools do you use in your Python work?

Uses Jupyter notebooks for testing code “chunk by chunk”
He doesn’t use Python every day, so he appreciates the interactive execution mode of Jupyter
Uses Sublime as an editor for his Python work + a console
He uses RStudio for his R work

4. Looking at the concept, how would you execute the first line of code?

Highlight and run

5. Show what the concept looks like with the window showing up in the right hand pane. What do you think of this solution?

He prefers this approach to Jupyter, where everything needs to be separated into cells to run – here he can select just what he wants to run

6. Now imagine that you have many output cells created on the right hand pane. How would you figure out what code was run to generate the output?

Click on the chevron

7. Now imagine running the plotly code on the left. How would you execute lines 3-11 in the editor pane?

Select and run

8. Show what the concept looks like after running the plotly code. What do you think of this solution?

Feels it is intuitive
Suggestion: would like to have an arrow to the left to move code back into the editor window

9. Now imagine that you wanted to compare two output cells in the right hand window, but they were separated by quite a vertical distance (so you would need to scroll up and down to compare)

I led the witness here by asking what he thought the arrow to the top right did, but the result wasn’t surprising to him

10. After doing work for a while, the right hand side now has a lot of output. How would you clean things up to tell a better story?

“x” button to delete cells
I led the witness here by asking him what he thought the up and down arrows did

11. Now that you have the right hand window cleaned up, how would you share your work with others?

Export to Jupyter button – felt it would give him the best of the Sublime experience with Jupyter

Richard Sharp, Starbucks

Top Takeaways

He would definitely try the experience, but he feels it needs to be a significant improvement on how he works today to switch
Has a local / remote workflow, and provided valuable insights into how he does this
Valuable insights into differentiating between exploration in Jupyter and creating an artifact that survives him using PyCharm

Hypotheses Validated:

 Interactively running a selection of code is intuitive - YES

 Interactively running the current line of code is intuitive - YES

 Popping the output cell into its own browser tab is intuitive - YES

 Popping the output cell into its own browser tab is useful - YES

 Rearranging the output cell using the arrow buttons is intuitive - YES

 Deleting the output cell using the x button is intuitive - YES

 Rearranging / cleaning up the output window contents is useful - YES

 Export to Jupyter function is useful - YES

 This is an interactive experience that the user wants - YES

Workflow Questions:

1. Can you describe what your role is in your day job?

Works as a Data Scientist in the marketing department at Starbucks
Direct relationship marketing (i.e., not broadcast) – send digital offers to customers (e.g., Facebook, Starbucks app, web)
Spends more time doing data engineering than data science
Drives creation of KPIs to drive future direction in personalized advertising

2. What programming languages do you use for Data Science?

Python guy at the moment – for last 4 years. Last job was also a Python shop
Works at Starbucks with a bunch of R colleagues, though the predominant language is Python
His R colleagues are classically trained statisticians
His observation is that Python has grown up quickly in the past 5 years and is now mainstream acceptable

3. What tools do you use in your Python work?

Hardware: laptop and 2/3 VMs on a cluster
Jupyter for exploration
Installed on HDI cluster head node
Connect from browser to Jupyter on that HDI head node
Uses it to:
- Test small chunks of code
- Play with Spark
- Visualize data
- Other ad-hoc interactive stuff
PyCharm for reproducibility
- Goal is to turn results from Jupyter notebook into a single Python file
Likes
- Great editing experience with autocomplete / syntax highlighting / formatting
- Enough organization for code at project level (not large projects)
- Integrates “a bit” with git
- Maintaining configurations in PyCharm isolated environments
  - Scripts
  - Configurations
  - Environments
  - Versions
- Cross platform – though he mostly runs on Windows
- Likes stepping through code in PyCharm; would love to have that in Jupyter
- Debugs code by running interactively until he hits a breakpoint, then copies the code and re-runs it in Jupyter so he can visualize
Uses git for file transfer between client and server
- Develop on laptop / VM
- Pushes to GitHub
- Pulls code from GitHub
- Uses it like FTP
- Make it easy to do this
Suggestion: parallel debugging
- Debug math stuff – please integrate TensorBoard
- Debug the CS stuff on cluster - perhaps download locally to visualize?
- Would like to step through code and see histogram of a variable / array / data structure
Drawbacks of PyCharm
- Would love a pattern for develop local and deploy / debug on remote
- Would use a laptop to code against an Azure environment because
  - Data lives there
  - Spark lives there
  - Other dependencies specific to that environment

4. Looking at the concept, how would you execute the first line of code?

Right click on line and hit keyboard shortcut or run command

5. Show what the concept looks like with the window showing up in the right hand pane. What do you think of this solution?

Likes it – is intuitive and what he expected

6. Now imagine that you have many output cells created on the right hand pane. How would you figure out what code was run to generate the output?

Correctly identified the chevron
But also thought that it would provide
- List of what he imported
- List of variables in scope

7. Now imagine running the plotly code on the left. How would you execute lines 3-11 in the editor pane?

Select code and run

8. Show what the concept looks like after running the plotly code. What do you think of this solution?

Exactly like what he expected

9. Now imagine that you wanted to compare two output cells in the right hand window, but they were separated by quite a vertical distance (so you would need to scroll up and down to compare)

Full disclosure: in this interview, I asked him how he would “pop out” the cell, leading the witness. I have refined this in subsequent interviews and no longer lead the witness
Identified clicking on the diagonal arrow
Suggestion: make it really easy to send popped out image / cell to his boss

10. After doing work for a while, the right hand side now has a lot of output. How would you clean things up to tell a better story?

Identified using arrow up / down to reorder cells just like in Jupyter
Identified clicking on “x” to remove cells

11. Now that you have the right hand window cleaned up, how would you share your work with others?

Full disclosure: also led witness here, but given that he immediately saw the connection to Jupyter, clicking on

Solution Feedback Questions:

What are your impressions of this concept? (Keep the question open ended and general at first, so they talk about whatever’s top of mind for them after seeing the concept.)

One thing he likes about this solution vs. Jupyter is that everything is not in separate cells. He feels that he can easily just highlight what he wants and run it
Feels that export to Jupyter will make it easier to collaborate with other people
Made the observation that there will be a need to “tidy up” the output (i.e., remove things that are no longer relevant)
He made a suggestion – make it easy to send code from the output back to the input (we didn’t really dig into how to do this, as where the insertion point should be is going to be a problem for most people)
He likes that he can have a clean Python file at the end – by iterating on the code in the editor, he winds up with a file that he can just run to completion.
He made another suggestion while we were talking about interactivity – make it easy to export the current Python environment path – I’ve captured this ask with the VS Code Python team

Who else is this concept relevant to? (Maybe they’re not the best user for it, but their teammate is.)

Data science colleagues
Decision scientist - data analytics vs. data science
- Data science creates models
- Data analytics - report is the result
- There are R users there - they would like R
- Also pure SQL people as well
Suggestion: look at Databricks - click here to plot demo
Starbucks - need pure SQL support

What products, services, or technologies do you use in place of this concept?

PyCharm, Jupyter, Git

If this concept were available, how would you use it?

Prototype interactively, and turn the interactive code into a Python script

What would keep you from using it?

What is Microsoft locking me into? We had a discussion about this, and he realizes that there is no real lock-in here, but the concern is still there
If he couldn't bring his own libraries and use pip env / virtual environments to manage environments
If it is hard to install packages etc. or hard to switch
Communicate with HDI cluster / Spark is needed

Rob Ringham, Avanade

Top Takeaways

He wants to use this concept right away, and for the most part the concept was intuitive
He prefers IDEs, and he sees this as a great step towards giving him the best of an IDE and a Jupyter experience.
There is a small amount of confusion around whether the right-hand window is fully interactive or not. That was removed when I described it as a History window. Perhaps calling it that might remove the confusion.

Hypotheses Validated:

  Interactively running a selection of code is intuitive - YES

  Interactively running the current line of code is intuitive - YES

  Popping the output cell into its own browser tab is intuitive - YES

  Popping the output cell into its own browser tab is useful - YES

  Rearranging the output cell using the arrow buttons is intuitive - NO

  Deleting the output cell using the x button is intuitive - YES

  Rearranging / cleaning up the output window contents is useful - YES

  Export to Jupyter function is useful - YES

  This is an interactive experience that the user wants - YES

Workflow Questions:

1. Can you describe what your role is in your day job?

Works on the Avanade AI incubation team
Specializes in the MSFT AI stack – he uses: Cognitive Services, toolkit, PyTorch / TensorFlow is key, Azure ML, Azure Data Lake
Focused on DL and computer vision, NLP now as well (via LSTMs with TensorFlow/CNTK)
Workflow is mostly done through AML workbench with Jupyter notebook training with DSVMs with GPUs behind them
Thinks of himself as an applied Data Scientist
Wasn't classically trained in data science or statistics
Has learned on his own traditional AI as well as software engineering / architecture
One task is to turn research papers into ideas for clients. We talked about Karpathy's unreasonable effectiveness of RNNs as an example of this

2. What programming languages do you use for Data Science?

Python for DS
Did R, but no R in last year and a half
Dev work: Has built a bot in C# that calls his Python web services
Dev work: Models deployed behind Flask on Ubuntu + nginx

3. What tools do you use in your Python work?

Uses a Jupyter notebook for exploration / training
Uses VS Code for editing Python files
Hasn’t used VS in a long time
His desktop right now: 3 windows open
- Jupyter notebook where he is training a model from a CSV file
- VS Code window with a local Python Flask service that sits in front of the model
- VS Code window that has a Python script that “benchmarks” (sees how accurate the model is as opposed to performance) his model
Unprompted: hoping that we are going to discuss Jupyter in VS Code today

4. Looking at the concept, how would you execute the first line of code?

Right click on line and run it via either a context menu or a keyboard shortcut
He thinks that this line could be like an input cell in a notebook
Thinks the output would be sandwiched between lines 1 and 2, much like what you see in Jupyter notebooks

5. Show what the concept looks like with the window showing up in the right hand pane. What do you think of this solution?

He likes it – it reminds him of LightTable
Suggestion: highlight and auto-scroll to the code that generated the output.
- This is something that he asked about a few times – so in his case he is wondering whether there can be an association between the output pane and the code in the editor buffer.
Unprompted: saw the export to Jupyter button and likes that as a way to get a full Jupyter experience
Unprompted: was wondering if we could right click and define markdown cells on the right hand side as well to “add documentation”

6. Now imagine that you have many output cells created on the right hand pane. How would you figure out what code was run to generate the output?

Talked about lines between code cells in right and code in the editor buffer first
Needed prompting to see the chevron that hides / reveals code
Once he saw that, he was comfortable with that as a solution for seeing the code that generated the output

7. Now imagine running the plotly code on the left. How would you execute lines 3-11 in the editor pane?

He proposed many options:
- Step forward through lines to run it
- Select lines 3-11 and run a keyboard shortcut
- Run until a breakpoint
- Run chunks of code that are separated by blank lines
He talked about his frustration with the lack of debugging support in Jupyter
- Would like to run line by line in addition to chunks
- Most often, he would run chunks of code
- Would like both!
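
One of the options he proposed, running chunks of code separated by blank lines, can be sketched as below. This is a minimal illustration under stated assumptions, not how the prototype works: the function names `split_into_chunks` and `run_chunks` are hypothetical, and the naive split assumes no chunk contains a blank line inside an indented block (such as a function body).

```python
def split_into_chunks(source):
    """Split source text into chunks separated by one or more blank lines."""
    chunks, current = [], []
    for line in source.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the chunk being collected.
            chunks.append("\n".join(current))
            current = []
    if current:
        chunks.append("\n".join(current))
    return chunks


def run_chunks(source):
    """Execute each chunk in turn, sharing one namespace between chunks."""
    namespace = {}
    for chunk in split_into_chunks(source):
        exec(chunk, namespace)
    return namespace


if __name__ == "__main__":
    code = "a = 1\nb = 2\n\nc = a + b\n"
    print(split_into_chunks(code))
    print(run_chunks(code)["c"])
```

Sharing a single namespace across chunks mirrors the way a notebook kernel keeps state between cells; running line by line, which he also asked for, would amount to treating every non-blank line as its own chunk.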

8. Show what the concept looks like after running the plotly code. What do you think of this solution?

Expected to see output in the right window, like before. Liked the experience.
Asked: can you step through the code on the right hand side?
Suggested: Run to the cell to reproduce the output in that cell

9. Now imagine that you wanted to compare two output cells in the right hand window, but they were separated by quite a vertical distance (so you would need to scroll up and down to compare)

First suggested that the arrow pointing up and right could be used to pin a cell in place
When shown that the concept pops out the window, he said he would have liked to see the window inside of the VS Code frame
After he thought about it, he came back with how he felt that this did solve the problem of comparing two different plots
Suggestion: the popout window could be an “accumulator” that would accumulate plots pinned to it. This is an interesting suggestion to consider …

10. After doing work for a while, the right hand side now has a lot of output. How would you clean things up to tell a better story?

Unprompted: thought that the up/down arrows would let him cycle forward and back on the history of a cell. Clearly he’s still thinking that a cell is bound to a specific region of code on the left hand side.
Suggested that “x” could be used to remove a cell.
I explained the Jupyter analogy, and he realized that the up/down arrows could be used to move the cells around in the history.

11. Now that you have the right hand window cleaned up, how would you share your work with others?

Immediately saw how the Export to Jupyter button would enable this scenario. Was looking forward to this.
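
As a rough sketch of what an Export to Jupyter step could produce, the snippet below turns a list of executed code snippets into a notebook-format dictionary and writes it to disk. It is an assumption-laden illustration, not the prototype's implementation: the function names `build_notebook` and `export_notebook` are hypothetical, outputs are left empty rather than captured, and the structure follows the notebook format version 4 (minor version 4, which does not require cell ids).

```python
import json


def build_notebook(code_cells):
    """Build a minimal nbformat 4 notebook dict from a list of code strings."""
    cells = []
    for source in code_cells:
        cells.append({
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],  # outputs are not captured in this sketch
            "source": source.splitlines(keepends=True),
        })
    return {
        "cells": cells,
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def export_notebook(code_cells, path):
    """Hypothetical helper: write the notebook as an .ipynb JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_notebook(code_cells), f, indent=1)


if __name__ == "__main__":
    history = ["import math\nprint(math.pi)", "x = 1"]
    print(json.dumps(build_notebook(history), indent=1))
```

Because the cells are emitted in list order, any rearranging or deleting done in the output window would be reflected simply by reordering or filtering the list before export.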

Solution Feedback Questions:

1. What are your impressions of this concept? (Keep the question open ended and general at first, so they talk about whatever’s top of mind for them after seeing the concept.)

Loves the concept - would use this today if he could
Editor experience - would like to see an experience from right to left - a line of some sort

2. Who else is this concept relevant to? (Maybe they’re not the best user for it, but their teammate is.)

People who are doing data science
People who are doing applied data science
More tuned towards a notebook experience

3. What products, services, or technologies do you use in place of this concept?

Described his current desktop with 3 windows open:
- is a notebook where he is training a model from a CSV
- is a VS Code window with a local Python Flask service that is in front of the model (thing that gens the CSV)
- is a VS Code window that has a Python script that calls the Flask service that runs his "benchmark"
Thinks he can use just this one tool for all of these jobs – can eliminate ALT+TAB.
Suggestion: Wants an experimentation tab (how to group foo.py and its output)

4. What do the competitor offerings for this concept do a particularly good/poor job at?

PyCharm was too far from this experience
VS Code is good for many languages vs. PyCharm being focused on just one language
Loves the extensions available for VS Code. Example: extension for rainbow columns in CSV files in VS Code - super extensibility
He left Atom and others and went to VS Code
Spends half time working on Windows/Mac - cross platform was key

5. Other comments that could be incorporated into future questions:

Sees the power of the tool because it combines all IDE features, e.g., debugging, linting, IntelliSense, refactoring, search and replace etc. with a notebook story
Suggestion: adding TensorBoard diagnostic tooling would be great as well
Thinks that the Figma prototype UX would be better if it showed the typing experience as well