Learning how to code can be hard when you start. Your brain is trying to apply an abstract concept to an abstract problem using datasets or projects that you’re not deeply invested in.
Tutorials usually give you sample datasets to work with, often public ones that become the basis for repeating code until you learn something. The ones I see most are:
- Titanic (data about survival, class, age)
- Iris (can be loaded from the R library “datasets”)
- MPG or mtcars (data about gas mileage by car model)
R ships with many built-in datasets, which makes them easy to practice with; some examples are listed by STHDA (n.d.).
The nice part of using these datasets is that you can easily check your work with the tutorial, and it’s not difficult to understand where you may be off.
The bummer about using these datasets is: it’s boring.
So I would encourage new learners who are struggling to get through coding to pursue it along two routes.
1. Where can you apply it in your work?
This is a great question because it means you’re getting real-time practice with work that is meaningful (hopefully). Plus, if you have a network at your job, you’ll likely get feedback, or you can use it to connect with others about what they’ve done. If you use code to produce outputs for presentations or client-facing work, you can also see how you’ve progressed over time. And if you’re anything like me, you can see how much more efficient your coding becomes by tracking your time with a program like toggl (which has its own analytics too!) and comparing it to a previous month when you worked on a similar task.
2. What is interesting in your personal life that can be captured through data?
Besides work, another space to grow your coding and analytics skills is your personal life. What’s interesting to you that’s happening right now? For me, I’ve looked at public datasets around healthcare and visa timelines during the pandemic. Most recently, I’m diving into how often, how coherently, and in what language I sleeptalk.
These can be really humorous spaces to look at data but, more importantly, personally insightful ones. And of course, they are great spaces to practice your skills without a deadline. You can easily start collecting personal data from an app you already use (for example, I pay for sleepcycle and therefore have access to my sleep data, but there are many apps you may use for productivity or for tracking your emotions or energy), or just by observing your life. Maybe you tally every time your dog whines. Over time you might find certain things interesting, such as what time of day it happens and how it resolves (food or a walk, for example). The data you collect can then help you form your own hypotheses and try out code you’re interested in, in a way that applies directly to your life.
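To make the dog example concrete, here is a minimal sketch in Python of what tallying might look like. Everything in it is an assumption for illustration: the log entries are invented sample rows, and the names `WHINE_LOG`, `part_of_day`, and `tally_whines` are hypothetical helpers, not part of any library. The same idea would work just as well in R or a spreadsheet.

```python
from collections import Counter
from datetime import datetime

# Hypothetical observations: (timestamp, how the whining resolved).
# These rows are invented sample data, not real measurements.
WHINE_LOG = [
    ("2022-02-01 07:15", "food"),
    ("2022-02-01 17:40", "walk"),
    ("2022-02-02 07:05", "food"),
    ("2022-02-02 12:30", "walk"),
    ("2022-02-03 07:20", "food"),
    ("2022-02-03 17:55", "walk"),
]


def part_of_day(hour):
    """Bucket an hour (0-23) into a coarse part of the day."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def tally_whines(log):
    """Count whines by (part of day, resolution)."""
    counts = Counter()
    for stamp, resolution in log:
        hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour
        counts[(part_of_day(hour), resolution)] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequent combinations first.
    for (period, resolution), n in tally_whines(WHINE_LOG).most_common():
        print(f"{period:<10} {resolution:<5} {n}")
```

With a few weeks of real entries in place of the sample rows, a table like this is often enough to suggest a first hypothesis, for instance that morning whining tends to be about food.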
The best part is that, while your work may at times be protected by a non-disclosure agreement, or you may not be able to share your work analytics for whatever reason, you can always display your skills in a personal portfolio.
So what are you waiting for?
- STHDA. (n.d.). R built-in data sets. STHDA. Retrieved February 13, 2022, from http://www.sthda.com/english/wiki/r-built-in-data-sets