Productivity tips - leondutoit/data-centric-programming GitHub Wiki
When it comes to teaching people and transferring skills, there are way too many things that experienced programmers tend to take for granted. Of course this is not limited to the field of programming, but is a natural result of forgetting what it is like not to know something. Can you really remember what you were struggling with when you learned how to swim or ride a bike? Probably not. In any case, this section includes some practical advice for productivity in programming that will be beneficial to any data person.
Text editors and IDEs
Whether you use a text editor, an IDE (Integrated Development Environment) or both for programming is a deeply personal choice. If you don't already have a favourite you will naturally develop one or two over time. I do however think it is better to start off programming in a more minimal text editor, since you will absorb more of the language and less of the editor. As a friend of mine once said: "If the language you're using requires you to use an IDE, what does that say about the language?" Food for thought.
What is the difference between a text editor and an IDE? I guess it is just a matter of degree. It is probably best that everyone decides for themselves exactly how they are different, but I would say that IDEs are simply much heavier. Heavier in the sense that they have a strong influence on how you write code, whereas text editors are more open to customisation according to your own preferences and by default do not steer your writing as much.
I think it is a good idea to have basic competence in either emacs or vim - you will most probably have to do things on remote servers where the only editors available are emacs, vim, vi or something similar. If you're a Mac user and an emacs lover, have a look at Aquamacs. Another popular text editor (my workhorse and personal favourite) is Sublime Text.
Three of the best-known IDEs are Eclipse, IntelliJ and NetBeans. I use IntelliJ for Java development, but they are all pretty much the same.
There are also two notable IDEs specialised for Python and R respectively: PyCharm for Python and RStudio for R. While I have never worked with either of them, I am sure they are both well worth investigating. IDEs, especially RStudio, are not unlike the rich graphical user interfaces found in Matlab or Stata.
Ultimately it is not make or break which tool you use. The most important thing is that you get to know it really well and that it helps you get shit done.
Avoid the mouse
Why? Because you can type much faster than you can navigate and click. You have roughly 104 keys at your fingertips which can be combined with 10 fingers to produce a whole lot of keystrokes per minute. Way more than you can click a mouse and move the pointer around the screen. In fact, I would be quite happy if I were able to use my computer without my mouse altogether.
A natural consequence of avoiding the mouse will be that your typing will improve and that you will discover many keyboard shortcuts that will make you much more productive in the long term. Like many other activities, programming with data is a game of marginal and cumulative gains. Imagine how much time you will save over many years if you can do the small stuff faster...
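To put a rough number on those cumulative gains, here is a small back-of-the-envelope sketch in Python. The function name and every input value are hypothetical, chosen only for illustration; plug in your own estimates.

```python
def hours_saved(seconds_per_action, actions_per_day, working_days_per_year, years):
    """Total hours saved when each small action gets faster by seconds_per_action."""
    total_seconds = seconds_per_action * actions_per_day * working_days_per_year * years
    return total_seconds / 3600  # 3600 seconds in an hour


# Hypothetical inputs: 2 seconds saved per action, 200 actions a day,
# 220 working days a year, over 10 years.
print(round(hours_saved(2, 200, 220, 10)))
```

With these made-up inputs the saving comes to roughly 244 hours, which is about six 40-hour working weeks.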
Get to know Linux
Many analysts will earn degrees and gain work experience for many years without ever touching the Linux operating system. Being competent with the Linux OS, at least on the command line, is essential if you want to be a proficient data person with open source tools.
Thankfully it is not necessary to abandon your machine's current OS in order to start working with Linux. You can simply start up a virtual machine and run a Linux OS on it. Instructions on how to do this can be found in the GitHub repo that hosts this wiki. You can read more about the tools used to do this in the section about the Vagrant VM.
Make the command line your home
The command line, via your terminal emulator (just another app), gives you a text interface to your file system and parts of your operating system. Interacting with files and the OS through this interface should be your default way of working. For Mac OS X, I recommend using iTerm2 - it has some fancy features like splitting windows that are very handy. I also recommend setting the background colour to something dark and the text to something light - better for your eyes.
The command line section of the wiki shows and discusses some of the essential things you should know to make the command line your home. In the rest of the wiki I also assume that you are comfortable with what is discussed there, and that you always have access to a running Vagrant virtual machine to replicate the examples.
Keep accessible notes
It takes time to learn things well enough that you can just do them without thinking or looking at a reference. Some routine things I'm sure I will never remember - and never want to, for that matter. To help your brain, keep notes of small code snippets, such as useful command line tricks or solutions to common problems. I have a directory called programming_notes where I keep plain text files with helpful notes about things I can't remember, things I still want to learn and articles I want to read or keep a reference to. Being a Sublime Text user, I always have this folder open in my text editor and can jump to a file within a second. You should think about doing the same kind of thing.
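If you would rather search such a notes folder from a script than from your editor, a minimal Python sketch could look like the one below. It assumes the notes are UTF-8 plain text files ending in .txt; the function name search_notes is hypothetical and not part of any tool mentioned in this wiki.

```python
from pathlib import Path


def search_notes(notes_dir, term):
    """Return (file name, line number, line text) for every line containing term."""
    matches = []
    # Assumption: notes are .txt files, possibly in subdirectories of notes_dir.
    for path in sorted(Path(notes_dir).rglob("*.txt")):
        # Undecodable bytes are replaced rather than raising an error.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive substring match.
            if term.lower() in line.lower():
                matches.append((path.name, number, line.strip()))
    return matches
```

For example, search_notes(Path.home() / "programming_notes", "rsync") would list every note line that mentions rsync, provided the directory lives in your home folder.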
Open source licenses
The Open Source Initiative has a sensible definition of what it means for software to be open source. They also provide a very good overview of widely used licenses. You will be using plenty of open source software and maybe even making some of your own, and knowing the licenses attached to different pieces of code is essential in this case. For an interesting discussion look here.