Jayashree is a seasoned professional with broad industry experience spanning design, manufacturing, project management, and software development. Over the years she has built her expertise in software development, moving from the days of Fortran, Pascal, and C to the world of J2EE, Python, and Android. She currently pursues a deep interest in J2EE solutions, data science applications, and mobile solutions on Android. She seeks out and enjoys solving technically challenging software problems.
Jayashree Javi’s Introduction to Data Science
How did you begin your career in data science?
I was fortunate to work at a company that practiced Big Data Engineering and Data Science well before those terms became popular. It was an Information Retrieval company where processing millions of documents for search and retrieval every day was the norm. We were a Java/JEE shop, and we regularly applied Big Data Engineering and analytics techniques to generate usage reports from massive volumes of log files. We were using Hadoop before Hadoop was widely known.
Subsequently, I was offered teaching and training opportunities at my alma mater, Wayne State University. I started teaching on the side, the classes became popular, and I eventually moved to teaching full time. I then received opportunities to train corporate software engineers on Amazon Web Services (AWS), including running sentiment analysis, recommendation systems, and similar workloads on Elastic MapReduce (EMR).
Then the wave of Data Analytics drew me toward Exploratory Data Analytics using Jupyter. Since Python was more popular than R, I chose Python and used NumPy, Pandas, and Seaborn to create visualizations. I currently teach all of these concepts.
What do you feel are some common misconceptions about data science or your work in general?
Data Analytics, Data Science, and Big Data Engineering are three different fields, and each requires different skills. For Data Science, a person needs to be good at writing and applying various algorithms. Exploratory Data Analytics is the first step in the journey of a Data Scientist, and it is low-hanging fruit that anyone good at algebra and numbers can reach. When everyone is talking about AI, machine learning, self-driving cars, and so on, many people are intimidated by the sheer complexity of the subject and come to believe they cannot master it. This is not true. There is a path to the highest level of complexity, and the journey is not very difficult. Along the way, each person can discover which role fits them best and then pursue that role.
What inspired you to learn more about Data Analytics?
Science and math were my favorite subjects in high school, so applying science to data was a natural progression after having worked at a Big Data company. After a few Exploratory Data Analytics projects, I was intrigued to find correlations between variables that I had never thought about before. Many people start by exploring the Titanic dataset, and I did the same. It was fascinating to find that 1st class passengers had a higher chance of survival than 3rd class passengers in the Titanic disaster, and that babies survived at a higher rate than others. Subsequently, I explored many datasets, and each one had a story to unearth. It is fascinating!
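The kind of comparison described above, survival rate grouped by passenger class, can be sketched in a few lines of Python. This is only an illustration: the rows below are toy values invented for the example, not records from the actual Titanic dataset, and the function name `survival_rate_by_class` is hypothetical.

```python
from collections import defaultdict

# Toy rows invented for illustration only: (passenger_class, survived).
# These are NOT records from the real Titanic dataset.
passengers = [
    (1, 1), (1, 1), (1, 0),
    (3, 0), (3, 0), (3, 1), (3, 0),
]

def survival_rate_by_class(rows):
    """Return {passenger_class: fraction of passengers who survived}."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for pclass, survived in rows:
        totals[pclass] += 1          # count every passenger in the class
        survivors[pclass] += survived  # survived is 1 or 0
    return {pclass: survivors[pclass] / totals[pclass] for pclass in sorted(totals)}

rates = survival_rate_by_class(passengers)
for pclass, rate in rates.items():
    print(f"class {pclass}: {rate:.2f}")
```

With Pandas, the same grouping is usually a one-liner along the lines of `df.groupby("Pclass")["Survived"].mean()`, assuming the data is loaded into a DataFrame `df` whose columns carry those names.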
What value do you hope your research into Data Science will eventually provide?
My focus has been to make Data Analytics, Data Science, and Data Engineering easy enough that anyone good at high school math can get into the subject. To that end, the first step is to learn Python. There are many resources for learning Python on the web, but I wanted simple and effective material, so I compiled my own and published it at http://ebooks.mobibootcamp.com/python/index.html
So far, all the students who have used my book have found it easy to take their first step, and they feel confident about taking the next leap into the world of Data Analytics. I’m planning to release more such books that add complexity gradually, to take anyone interested in Data Science to the highest level they can achieve.
What are some tools that you use to conduct your research?
I’m constantly engaged in teaching students the subject matter, and I take a hands-on approach to learning anything. By solving problems hands-on, a student gains tremendous confidence to move further. I also use the Google and Amazon cloud platforms heavily and teach my students how to leverage them for Big Data analytics. Open datasets are plentiful these days, and I let every student choose whichever dataset they would like to explore. This way they have fun learning, which positively reinforces their motivation.
What do you look for when hiring Teaching Assistants and Research Assistants?
I look for the same passion for teaching and for making things easy for students to understand. Keeping things simple has been my focus.
What advice would you give to students who aspire to be data scientists?
Keep an open mind. Although my material gives you a head start, once you have enough confidence, explore online resources and learn the many facets of this subject. Solve problems along the way and publish your solutions. That way you can show that you too can do it! I would suggest choosing Python, as it is easier to learn and also more popular than R. Refer to my ebook at http://ebooks.mobibootcamp.com/python/index.html, and if you have any feedback, do not hesitate to let me know. My aim has been to keep the ebooks simple and effective at the same time.