
Jumping into Data Science: my internship at DieProduktMacher

DieProduktMacher | 2 June 2020

After finishing my studies in maths and psychology to become a teacher, I decided to give it a try as a data science intern at DieProduktMacher. Here I'll tell you how it went and what I learned working together with the Data Team!

What happened so far

When I started my internship at DPM, I knew very little about data science and machine learning algorithms. I had just finished my studies in mathematics and psychology to become a teacher when I met with an old friend who had been working in the data science field for years. The exchange with him impressed me and got me interested in what I understood to be a very powerful and slightly mysterious new technology.

A few days later, I met Andreas Franz, a senior data scientist at DieProduktMacher, and convinced him that I might make a good intern for the company he was working for. This is how I joined the DPM team. He now sits next to me, working on highly sophisticated TensorFlow code and reading papers from Indian data science experts. Meanwhile, I have been working on the same damn SQL query for three days.

Jumping into Data Science

I thought that my mathematics degree would qualify and prepare me for a job in data science. Turns out it doesn’t. I spent the better part of last year studying for my finals, trying to understand Galois theory and trick out differential equations. Right now, I wish I had focused more on statistics and numerical methods.

Thankfully, Andreas and all the other nice people at DieProduktMacher are very patient with me. Before starting with machine learning algorithms and Python, I get a quick overview of the more basic skills a data scientist needs. I am told about APIs and JSON, a format that I initially dislike and try to avoid. Compared to the neat tables I proudly copied out of the Postgres database, JSON files seem very cluttered and confusing. You see, I am at a point on the data science learning curve where long expressions with nested curly braces unsettle me quite a bit. I have yet to learn about the benefits of having data structured like that.
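Looking back, the benefit of those nested curly braces is easier to see with a small example. The following is only a minimal sketch: the response and all of its field names (`customer`, `orders`, `items`) are made up for illustration and do not come from any real API. It shows one nested JSON object holding a customer together with all of their orders, and how the same information turns into several repetitive rows once it is flattened into a table.

```python
import json

# Hypothetical API response; every field name here is invented for illustration.
raw = """
{
  "customer": {"id": 7, "name": "Ada"},
  "orders": [
    {"order_id": 101, "items": ["bike", "helmet"]},
    {"order_id": 102, "items": ["lamp"]}
  ]
}
"""

# Parse the JSON text into ordinary Python dicts and lists.
record = json.loads(raw)

# Flatten the nested structure into table-like rows, one row per ordered item.
# Note how the customer columns have to be repeated on every row.
rows = [
    (record["customer"]["id"], record["customer"]["name"], order["order_id"], item)
    for order in record["orders"]
    for item in order["items"]
]

for row in rows:
    print(row)
```

The nested form keeps the one-to-many relationship (one customer, many orders, many items) in a single object, while the flat table needs three rows and repeats the customer on each of them.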

Git is next. “Get that repo”, Lorenzo, another data scientist, tells me, “and push your code in there.” “Nessun problema”, I say, and head over to the Bitbucket website. After I spend five minutes cluelessly clicking around in my browser window, Andreas lets me know that Git is actually used from the command line and dictates a few lines that get me back on track in seconds. I begin to realize that coding is just one of many things a data scientist does. There is a ton of things to do and know apart from writing code, and all of it is required to keep the wheel spinning. Phew! So much to learn!

To be continued

A few days later, I am very excited to start my first image recognition project. I’m now quite comfortable with JSON files and Python scripting, and with downloading thousands of images in a few hours. For the first time in my life, I get a feeling for the huge amount of raw numbers that even a tiny picture is made of. The first architecture Andreas tells me about is VGG16. “That’s pretty old school”, another friend of mine says when I tell him what I’m doing. To me, convolutional neural networks don’t sound old school at all. 26” mountain bikes, the red edition of Pokémon and not having a mobile phone, those are things that I would use the term “old school” for. Obviously, things move faster in the world of IT. Still, learning about the basics of edge detection is nice, because telling people that I work in data science while thoughtlessly using imported models feels wrong. I wouldn’t call myself a cook after putting a can of ravioli in the microwave. The VGG16 approach doesn’t work as we had hoped, so we try the patented SIFT algorithm. It’s more than 20 years old, but now everything works like a charm. Amazing!
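To give an idea of what “the basics of edge detection” look like, here is a minimal sketch in plain Python. It is not the code from the project and not what VGG16 or SIFT do internally: it only applies a hand-written Sobel kernel, a classic edge filter, to a made-up 5x5 grayscale picture. The image values and the helper name `filter_image` are assumptions for illustration. A convolutional network learns filters of this kind from data instead of having them written by hand.

```python
# A tiny 5x5 grayscale "image": each pixel is just a number between 0 and 255.
# The two left columns are dark (0), the three right columns are bright (255).
image = [[0, 0, 255, 255, 255] for _ in range(5)]

# Sobel kernel that responds to left-to-right changes in brightness,
# i.e. to vertical edges.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def filter_image(img, kernel):
    """Slide a 3x3 kernel over the image (no padding) and sum the products.

    Strictly speaking this is a cross-correlation, which is the operation
    deep learning libraries usually call "convolution".
    """
    height, width = len(img), len(img[0])
    out = []
    for y in range(height - 2):
        row = []
        for x in range(width - 2):
            total = sum(
                kernel[ky][kx] * img[y + ky][x + kx]
                for ky in range(3)
                for kx in range(3)
            )
            row.append(total)
        out.append(row)
    return out


edges = filter_image(image, SOBEL_X)
for row in edges:
    print(row)
```

The output is large wherever the window straddles the jump from dark to bright and zero where the brightness is uniform, which is all an edge detector really reports.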

I realize how important experimentation, experience and optimism are in this job. I probably would have given up after not getting the results I had hoped for with the first method. Andreas, however, had seen it coming all along and was already thinking about other promising approaches to the problem. At uni, I had gotten used to problems having one definitive solution and not having to worry too much if things didn’t work out at once. You don’t get it or your method doesn’t work? Don’t bother - wait until Thursday and the assistant professor will give you tailor-made answers to all your questions. Thank god for Stack Overflow, I think!

In my third week at DPM, the coronavirus has us all working from home, which comes as a bit of a disappointment - I really liked the open and bright office. Luckily though, Andreas makes an effort to look after me remotely and spends a lot of time debugging my code, answering my questions and explaining stuff to me, so I’m confident that the next weeks will be as exciting as the last ones. I’m learning new things every day, be it about the raw mathematics happening in the background or best practices for seemingly simple, daily tasks.

