Python’s simple structure has been vital to the democratization of data science. But as the field rushes forward, making splashy headlines about specialized new jobs, everyday Excel users remain unaware of the value that elementary building blocks of Python for data science can bring them at the office.
Join us for a conversation about bringing Python out of IT and into the business. We’ll share challenges and successes from writing tutorials, teaching classes, and advocating adoption among new users.
I really enjoyed the presentation and received a lot of positive feedback. As a result, I wanted to capture some of the ideas in a post so that the broader community could see them and generate some dialog on tips and techniques that have worked for you. The actual content in this blog is closely tied to our presentation but contains some additional ideas and thoughts that I may want to expand on in future posts.
Finally, thanks to Katie for suggesting the topic and allowing me to partner with her on the presentation. I think having two different but complementary backgrounds really helped the presentation cover a lot of different perspectives.
What are you trying to accomplish?
Before jumping right in to training everyone how to use python, it is important to understand what the goal is. This diagram shows where I think python fits into the IT ecosystem of a relatively large organization.
My experience is that Corporate IT functions are good at implementing large ERP-type applications or major cloud integrations like Salesforce. I suspect most medium to large organizations have some flavor of these applications in place (and likely many others).
In some cases, the IT organization will have a team to build custom developed applications using .NET or Java. Obviously, there will be a lot of Access and Excel sprinkled through the organization.
This distribution leaves a huge gap. There are problems that are not big enough for an off-the-shelf solution or worth developing a full application. However, they are critical enough that Excel + Access is not a good solution. Vicki Boykis does an excellent job of talking through this problem on a broader scale. I encourage you to read IT runs on Java 8 for a well-written perspective on the problem of the hype of IT vs. the reality in many organizations.
I contend that python is an excellent candidate to fill in that gap and that it does not need to be solely the realm of corporate IT. “Super users” and other domain experts can (and should) be trained in using python to fix the problems that they face on a daily basis.
Python is almost 30 years old. Over its lifetime, it has always been known as a great “glue language.” It has gone through a phase where it demonstrated success as a strong language for web development with tools such as Django and Flask (and many others). Now more recently, it is widely used in Data Science. All of this power and flexibility leads me to wonder, why can’t we leverage it for other tasks within the organization that have not had much support from IT? We can do so much better for our people if we give them more tools besides Excel and VBA!
Pick Your Battles: People
If you have a similar experience and are interested in trying to use python to fill that gap, how do you proceed?
The first step is figuring out which people are good candidates for learning python.
My default profile is the person that is viewed as the Excel “guru” and has a strong understanding of the business process. In any group of 5-10 Excel users there always seems to be at least one person that knows the ins and outs of the Excel tools as well as the business problem. This combination can be a good place to start.
However, there are some people that are more interested in collecting a paycheck than trying to automate the boring stuff. Job insecurity is a real issue that needs to be factored into the people part of the process. It is important to emphasize what the benefits to the employee will be if they automate some of the mundane parts of their job. In my experience there is more than enough work to go around!
Given the rise of python in Data Science, there is likely going to be more general awareness of python than there was 10 years ago. One implication of this is that people may be more willing to agree to try python out. However, having interest in python is not sufficient. There is actually a bigger gap than you might expect in getting someone from “I can cut and paste VBA” to “I can cut and paste python to solve my problem.”
Unfortunately I don’t think there is a simple checklist to determine who is a good candidate to try to teach python. I do think that extra work outside of the 9-5 daily job is needed. If you embark on this process, you will relatively quickly get a sense for who is really willing to work at it and who is not.
There are lots of additional dynamics when trying to teach co-workers how to use python:
- Are people doing this only because the boss expects them to?
- How much outside of work time should be dedicated to the learning?
- Can people of different levels learn together? What if the “senior” person really struggles during the process?
Despite these potential “gotchas” the payoff for the organization can be very large. Think about how much could be done in your job if you had a team of 2-3 python-savvy experts that could help you out!
Adjust to your audience
I can almost promise that once you embark down this path of trying to bring python into your organization, you are going to have to adjust to the audience. Some people are going to be much more eager than others. The style that you used to learn python is likely going to be much different from your co-workers’ needs.
You will need to be prepared to adjust and take cues from the audience. In addition, there may be broader organization changes that cause you to shift focus. For instance, what happens if more departments are interested in your activities? What will you do if IT or management push back for various reasons?
In addition, keep in mind some of these thoughts:
- How would you scale if more people join?
- How can you keep people engaged as they move at different speeds?
- How much “take-home” content do you need to provide?
- How much time can you carve out to teach?
Imitate Better Teachers
One of the biggest challenges with spreading python in an organization is that knowing python does not make someone a good teacher. In fact, the way you likely learned python is not the same way others will want to learn.
Once you start the teaching process, here are a few tips and tricks:
- Don’t be ashamed to bring in “better” teachers or others with python knowledge
- Bring in “lab assistants” to help with the minor gotchas
- Build a team approach so that it does not just fall on your shoulders
- Identify a blog, YouTube Channel or other resource that can be used inside and outside the trainings.
- Leverage any outside meetups/groups in the area so that the training can happen outside of work hours.
Learn what they’re trying to accomplish
It is really important to understand what the students are trying to accomplish. Do they want to move into a Data Science role? Do they just want to be more efficient in the current role?
They are likely excited to try to solve some sort of real-world problem in their day to day jobs. However, it is somewhat tricky to figure out the “right” problem to tackle first. You will need to steer them to solvable problems that they can maintain in the future.
Here are some reasons to automate a process:
- Save time - This is the first instinct but may not be the best reason.
- Get a quick win - Prove the value of python.
- Learn about the problem - How “solvable” is the problem?
- Develop an improvement mindset - Get people to think about their problems differently.
- Process is boring or has lots of mistakes.
In addition, certain problems are better for python automation than others at this early stage in the process. Here are some characteristics of “good problems”:
- Large data sets - 100,000+ rows of data in Excel.
- Well understood problem - Focus on learning python not the problem.
- One step in a long process - Start with a manageable piece of the process.
- Text manipulation - Excel is used for lots of string manipulation tasks that python can do very well.
- Formatting of output does not matter - Focus on core data wrangling, not making it look pretty.
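To make the “text manipulation” item concrete, here is a minimal sketch of the kind of first problem I have in mind. It uses only the python standard library. The column names, the sample data, and the SKU layout (product-number-color) are made up for illustration; a real export from Excel will look different, and many people would reach for a library like pandas instead.

```python
import csv
import io

# Hypothetical export; in practice this would be a CSV file saved from Excel.
RAW = """customer,sku
  acme corp ,WID-0042-blue
Globex  Inc,GAD-0007-RED
"""


def clean_row(row):
    """Tidy up the customer name and split the SKU into its three parts."""
    # Collapse stray whitespace and apply consistent capitalization.
    name = " ".join(row["customer"].split()).title()
    # Assumed SKU layout: PRODUCT-NUMBER-COLOR.
    product, number, color = row["sku"].split("-")
    return {
        "customer": name,
        "product": product.upper(),
        "number": int(number),
        "color": color.lower(),
    }


def clean_rows(text):
    """Read CSV text and return a list of cleaned-up rows."""
    return [clean_row(row) for row in csv.DictReader(io.StringIO(text))]


for row in clean_rows(RAW):
    print(row)
```

In Excel this would likely be a chain of TRIM, PROPER, and MID formulas copied down a column. The python version is one small function that can be read, tested, and reused, which is a nice early win to show a skeptical audience.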
As a python expert, you may be tempted to start with the biggest challenges first. I recommend starting with something a little smaller - even if the time savings is small. There are going to be lots of hurdles and challenges outside of the specific problem. Do not bite off more than you can chew when first getting started.
People are going to be naturally skeptical so proving the value early is critical!
Pick your battles: process
There are lots of ways to teach people how to use python. It is really important to think about all the available approaches.
How do you decide where to start? It is not trivial to find and curate the content for your specific needs. A lot of it feels like reinventing the wheel, which can be discouraging. However, that might be ok if it helps you learn more and build better content for your attendees.
As you search for sources of content, try to keep a blend of various content types:
- Formal online courses
- Custom developed lessons focused on your data sets
- Balance of formal lesson approach vs. real-world examples
- CS 101 concepts
Once you do start teaching:
- Reassure people that this can be confusing
- It is ok if you forget things. It is ok to google or refer to past examples.
- Part of the process is figuring out how to get “un-stuck”
- Gamification through chocolate, treats, swag is great for key concepts
The entire process is not just about teaching python language syntax. It is about teaching people to solve problems in a unique way. Shifting from an Excel-first mindset to a python-first mindset takes time.
One Final Note
During one of the talks at Data Tech, I had the pleasure of listening to Peter Wang, the CTO of Anaconda, talk about the role of the Data Scientist. One of the really interesting comments he made is that we are moving into a world where basic data literacy is going to be a requirement for success in the workforce. Today we don’t expect you to be a Math major to do math or an English major to write. In much the same way, data literacy is going to be required of many more people and is not going to be just for Data Scientists.
I firmly believe that we can and should bring in more tools like python into our organizations so that we can be more efficient but also start to increase the data literacy of the entire organization. These are two mutually beneficial and reinforcing goals to keep in mind.
Five to 10 years ago, it might have been quite an uphill battle to try to bring python into your organization to solve your business problems. With the rise of python’s popularity in the Data Science world, you will have a much smaller hill to climb to convince others that python can help them solve their problems - even if it is not formal “Data Science.”
Once you get your organization on-board with the idea of using python, there is a lot of work to implement those ideas. This article includes a high-level framework for thinking through the process:
- Know what you’re trying to accomplish
- Pick your battles: people
- Know what they are trying to accomplish
- Adjust to your audience
- Imitate better teachers
- Pick your battles: process
I hope you found it useful. I am contemplating building out some more content for a “Lunch and Learn” series. If you have any ideas, tips or content that you have found effective, feel free to share any of your successes in the comments below.