Data science as a career

Waverer

Robin
It's something I am considering seriously, and I wondered if anyone here knows much about it and its pros and cons.

I am mid 30s and work in finance. Just finishing a professional finance qualification. I gather I would need to learn Python and SQL as well as get very good at statistics. What else is there to be aware of, in terms of the job market, job opportunities, what the work is like and so on?
 

Waverer

Robin
Thanks - how realistic is that? I am more thinking spend 6-9 months learning in my own time (and working out if it's for me) but the principle is the same.
 

kel

Pelican
It's what the bootcamps claim, and at different jobs I have unfortunately ended up having to work with bootcamp grads to some degree. They are, generally, not great at what they do though I'm sure there are exceptions.

Learning on your own and cultivating the craft will undoubtedly give you better results. The benefit a bootcamp offers (again, ostensibly, who knows how much of their marketing is bullshit) is that they'll plug you in to a network of companies looking/willing to hire from them. Some bootcamps give guarantees of jobs afterwards and/or set up something where they take x percent of your first y paychecks rather than making you pay up front.

The huge proliferation of these bootcamps makes me think the market has gotta be saturating, combined with the crash in expectations as companies realize "data science" is not a magic bullet for their problems any more than "social" or "big data" or "AI" or whatever "one weird trick!" type mania there is at the moment. Further, tbh, a lot of these bootcamp grads tend to be women and companies are happy to be able to hit some diversity goals, so your mileage may vary if you're not a visible minority.

In any event, I do definitely suggest you "learn to code" in some capacity, especially if you can do so in your off hours (or even better if you can get your current gig to pay for it and set aside time for it). Data knowledge combined with experience in finance will probably be an attractive combo for a while yet. When the time comes, I suggest you aim for employment in a boring, finance-oriented company (like a high frequency trading firm) that makes fistfuls of money but has no cool factor, rather than a startup that has the cool factor but might go under, because as low man on the totem pole and as someone in a tangential position (as opposed to a core engineer who's needed just to keep the thing up and going) you could be relatively precarious. That said, you might hit the lotto, and you can still make a stack before they do go under, so just be prepared to have to move if you go that route.
 
How important is an understanding of MRP software like SAP, Navision, AS400, etc.? Would a basic CS understanding (not including a Data Structures class) as well as understanding of the previous software get you anywhere in the field?
 

Waverer

Robin
It's what the bootcamps claim, and at different jobs I have unfortunately ended up having to work with bootcamp grads to some degree. They are, generally, not great at what they do though I'm sure there are exceptions.

Learning on your own and cultivating the craft will undoubtedly give you better results. The benefit a bootcamp offers (again, ostensibly, who knows how much of their marketing is bullshit) is that they'll plug you in to a network of companies looking/willing to hire from them. Some bootcamps give guarantees of jobs afterwards and/or set up something where they take x percent of your first y paychecks rather than making you pay up front.

The huge proliferation of these bootcamps makes me think the market has gotta be saturating, combined with the crash in expectations as companies realize "data science" is not a magic bullet for their problems any more than "social" or "big data" or "AI" or whatever "one weird trick!" type mania there is at the moment. Further, tbh, a lot of these bootcamp grads tend to be women and companies are happy to be able to hit some diversity goals, so your mileage may vary if you're not a visible minority.

In any event, I do definitely suggest you "learn to code" in some capacity, especially if you can do so in your off hours (or even better if you can get your current gig to pay for it and set aside time for it). Data knowledge combined with experience in finance will probably be an attractive combo for a while yet. When the time comes, I suggest you aim for employment in a boring, finance-oriented company (like a high frequency trading firm) that makes fistfuls of money but has no cool factor, rather than a startup that has the cool factor but might go under, because as low man on the totem pole and as someone in a tangential position (as opposed to a core engineer who's needed just to keep the thing up and going) you could be relatively precarious. That said, you might hit the lotto, and you can still make a stack before they do go under, so just be prepared to have to move if you go that route.

All great advice. Would my age (mid 30s) be an issue? I feel like I would bring relevant prior experience, but as far as data science goes I would presumably be starting at the bottom. I get the impression it's pretty meritocratic once you're in, so hopefully I could work my way up faster if that experience really is relevant.
 

kel

Pelican
Your age shouldn't be too much of an issue if you stay in the financial realm (moving to fintech). Stress your accomplishments in the first part of your career, then say something like "I was bumping up against the ceiling, I realized the only room I had to escape the limitations of finance and to improve and grow further was to study data science, adding that skill on top of my financial foundation". It's all in how you sell it, natch. That said, anything you can do to get something relevant on your resume would help you, anything at all kinda "data-sciencey" that establishes a little bit of experience for you and which you can use to weave a narrative around.

Regarding my second comment, I just mean that the tech industry has been ridiculously hot for a while now, a sort of mania where companies are burning cash and have no plan to ever be profitable, but this almost religious belief in "user data" being inherently valuable keeps the illusion going. Maybe it can go for another several decades, I dunno, but sooner or later there will be a pop, and I'm betting it's sooner rather than later (the rona, for instance, is helping. Suddenly actually making a profit doesn't seem so quaint to investors, and therefore the above mentioned companies who have no path to profitability could easily have trouble continuing the fundraising gravy train). So, a dorky but profitable engineering-one-point-oh kinda place, in your case probably a cubicle-y office building company with an interchangeable "Intertrode" style name that does whatever kind of work in finance that has no cultural allure but makes actual money (selling services to other, equally unsexy businesses probably) might be a safe harbor. A lot of people are trying to be "thought leaders" at places like Uber or whatever, and the more connected social strivers amongst them will be able to continue milking it, but a lot of the wanna-be-petite-noblesse won't make it when the chips are down and the patronage class has to start trimming the fat in their ranks.

All that to say: finance is quite literally where the money is, so if you can provide a skill that provides real, material value that will probably be safe in a way "director of social media" at whatever startup is hip today and gone tomorrow will not be (unless you have class connections that will move you to the next sinecure).
 

homersheineken

Kingfisher
I'm currently in the process of doing this - and I'm older than you.

Data Scientists are very much needed now, because they require a number of skills. And the skills aren't easy. You need to understand programming, building models, SQL, statistics, data structures... You also need to have good people skills to be able to understand the business needs and then extrapolate that into building the models and data the consumers want (not necessarily what they're asking for). I don't think it's just a fad, since nearly every business could benefit from what can be provided. Just like how every business benefits from having a good website, every business could benefit from being able to predict when a customer may leave/stop buying (customer churn), or how different demographics react to your business or.... the questions are endless.

Note, I already have a good background for a lot of the skills needed. I'm a web developer and have a good background in programming in a few languages (not python). I also have a background in SQL, (graduate level) statistics and Google Analytics.

Work is paying for my training (haven't chosen one yet) and wants to assume more of this work as our site grows. I'll also have a couple of people to mentor me.

I can't comment on these bootcamps. I know they can help in programming jobs, but Data Science is much more technically diverse, so I'm not sure if you can just walk into it like you can with learning a programming language.
 

Easy_C

Crow
It’s a lot harder because you need to have some semblance of mathematical comprehension. There are a lot of similar roles that are a good fit though. Right now the corporate world is obsessed with “big data” and “cloud” and is hoping to mine massive amounts of data looking for some small thing they can find to make their marketing brainwashing even more effective, so there’s a lot of space for people who can combine coding with a competency in concepts and math from a business management specialty (e.g. marketing).
 
I'm currently in the process of doing this - and I'm older than you.
Can you share a bit of how your path ended up in Data Science Homer?

My work has been curtailed somewhat and am exploring new opportunities. I had some tertiary experience with condition based monitoring that touched on much of what you say. We had a parallel team modeling energy infrastructure and creating digital twins for analytics. My role was mostly developing the hardware and determining bandwidth needed for the massive amounts of data being sent to the cloud that we would then remote monitor. We relied on gov't contracts and the industry was slowed considerably the last few years; I left to return to manufacturing. I can't say I find the new work particularly interesting. Doing vibration analysis and acoustic monitoring with some of the guys at the last job had more of the "you're not the smartest guy in the room feel". I like those kinds of jobs better.

If you don't mind answering, what industry are you doing this training in? I would prefer to stay more R&D oriented than making a move to the financial or marketing world.
 

homersheineken

Kingfisher
The industry is medical device mfg, but I don't think it's relevant since all industries could use it.

I've always been into computers, information and data - esp data visualizations and building them out. Not to mention work in the social sciences using statistics. This just seemed like a natural course for my career since it's adding on to many of the skills I already have. When you factor in the scarcity and the pay, it makes sense.
 
How important is an understanding of MRP software like SAP, Navision, AS400, etc.? Would a basic CS understanding (not including a Data Structures class) as well as understanding of the previous software get you anywhere in the field?

SAP and Microsoft Dynamics Navision are examples of Enterprise Resource Planning (ERP) systems. I think SAP is the most common and Oracle ERP is in second place.

The AS/400 is basically the last of the mainframes from IBM, usually used for highly robust legacy applications. Mainframes are still around: I used to work at one company that, as of 5 years ago at least, was still using a room full of VAXs. There is a skill set to getting data out of some of the old 4GL systems on those machines.

MRP is Material Requirements Planning, which is a method of "exploding" bills of material against orders, putting the requirements into "time buckets", then adding planned receipts and current inventory. A fairly simple concept. MRP II (Manufacturing Resource Planning) takes manufacturing capacity into account. MRP you can do in Excel or Access given the data. For MRP II you would be better off going with an Advanced Planning and Scheduling system add-on to the ERP system. The ISA-95 standard lays out typical connections between MRP, ERP, APS, etc.
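As a rough illustration of the MRP idea described above, here is a minimal Python sketch. The bill of material, orders, inventory and receipts are all made up, and the function names are hypothetical. It is also deliberately simplified: it explodes gross requirements all the way down and nets each item on its own, whereas a real MRP run nets level by level and offsets for lead times.

```python
from collections import defaultdict

# Hypothetical bill of material: parent -> [(component, quantity per parent)].
BOM = {
    "bike": [("frame", 1), ("wheel", 2)],
    "wheel": [("rim", 1), ("spoke", 32)],
}

def explode(item, qty, period, gross):
    """Add the gross requirement for an item and, recursively, its components."""
    gross[item][period] += qty
    for component, per_parent in BOM.get(item, []):
        explode(component, qty * per_parent, period, gross)

def net_requirements(gross, on_hand, receipts, periods):
    """Net each item's gross requirements against stock, bucket by bucket."""
    net = {}
    for item, by_period in gross.items():
        available = on_hand.get(item, 0)
        net[item] = {}
        for p in periods:
            # Planned receipts arrive at the start of the time bucket.
            available += receipts.get(item, {}).get(p, 0)
            required = by_period.get(p, 0)
            net[item][p] = max(0, required - available)
            available = max(0, available - required)
    return net

# Made-up customer orders: (item, quantity, time bucket).
orders = [("bike", 10, 1), ("bike", 5, 2)]

gross = defaultdict(lambda: defaultdict(int))
for item, qty, period in orders:
    explode(item, qty, period, gross)

net = net_requirements(
    gross,
    on_hand={"frame": 12, "wheel": 8},   # current inventory
    receipts={"wheel": {2: 4}},          # 4 wheels due in bucket 2
    periods=[1, 2],
)

for item, by_period in net.items():
    print(item, by_period)
```

The same arithmetic is what you would lay out in Excel or Access: one row per item, one column per time bucket.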

Regarding data science overall, I suppose my work in the past has overlapped that field somewhat. The issues I have with it are that 1) it is a bit trendy, and today's trend risks becoming tomorrow's cost savings (e.g., I have seen entire Six Sigma groups let go), and 2) it seems that most of data science is processing large amounts of data to find correlations -- if that is incorrect, that is at least a perception of it some people have. Correlations should be only the beginning, not the ending, of data analysis. Correlation does not mean causation. And even when it does, people may incorrectly assume the relationship of inputs and outputs is linear, when it often is not. In other words, finding the connections is cool, but often does not result in anything that can be put into action.

As an example, a company I worked for a while back tried some software that was supposed to process vast amounts of data and find what was causing some machines to go haywire (and when they went haywire they tended to catch on fire). I did the back end database work to consolidate and then export really large data sets. Some of the data came from process historians that had vast amounts of data. The big finding was that there was an electrical signal out of the thousands of data points captured that 100% correlated to the machines catching on fire! Sure was, it was the automatic fire system going off after the fire started. Kind of like a hospital discovering that people dying was correlated with phone calls to the morgue. So take away the phones and no one dies? That software never did find anything useful.
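To make the point concrete, here is a small Python sketch with entirely made-up numbers (not the data from that project). It shows a signal that correlates perfectly with fires across machine runs, and then a simple timing check showing the signal only ever appears after the fire has started, so it cannot be a cause or an early warning.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up records, one per machine run:
# (caught_fire, signal_seen, fire_start_seconds, signal_start_seconds)
runs = [
    (0, 0, None, None),
    (1, 1, 120.0, 121.5),
    (0, 0, None, None),
    (1, 1, 300.0, 302.0),
    (0, 0, None, None),
    (1, 1, 45.0, 46.2),
]

fire = [run[0] for run in runs]
signal = [run[1] for run in runs]
r = pearson(fire, signal)
print(f"correlation between signal and fire: {r:.2f}")

# Timing check: does the signal ever show up BEFORE the fire starts?
signal_ever_leads = any(
    signal_t < fire_t
    for caught, seen, fire_t, signal_t in runs
    if caught and seen
)
print("signal ever precedes the fire:", signal_ever_leads)
```

A perfect correlation plus "never precedes the event" is the signature of a consequence (here, the fire system going off), which is exactly the kind of thing process knowledge tells you before any software does.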

There were interesting things we found in the data at that company just using SQL queries and MiniTab, but it always involved an understanding of the underlying manufacturing process to make sense of it--just crunching data did not accomplish anything.

Not trying to discourage you from trying the field out, but try to be well-rounded and do more than find correlations in the data. If you learn the industry you are in, learn the company you are at, work with one or more departments really well, and you have good IT skills you should do well.
 
You have a very similar view of what would be interesting, Jacob. There seem to be three disciplines (and probably more) in the Data Science field. There's obviously potential in the financial field; that's not really my field of interest. Then you have the marketing and Business Intelligence angle. I've used SAP in a greater capacity beyond requirements planning. We had to pull install base, customer contacts and sales history from SAP and Navision (working across divisions hence the two systems) to formulate sales strategies. I know some of the finance guys were also analyzing profit centers and product lines via SAP (maybe more of the financial side).

I do think there is something to Condition Based Maintenance. You're right that some of the analysis is rather trivial in its conclusions.
There were interesting things we found in the data at that company just using SQL queries and MiniTab, but it always involved an understanding of the underlying manufacturing process to make sense of it--just crunching data did not accomplish anything.
The project I was on was heavily into the Digital Twin concept and deviations from its baseline measurements. We found (similarly non-mindblowing) that without heavy data documenting abnormalities, we couldn't gain much more insight than current Predictive Maintenance strategies. Maybe looking at the combination of correlation and covariance across broad operating conditions and determining independence of the variables. This topic leads more into machine learning however and I suppose that is another "kind of connected" subset of Data Science. Nevertheless, development was going into compiling data across all the install base for similar products...I left that place several years ago and I am a bit curious where their models have progressed. I guess I asked about MRP/ERP software since on this side of the business we really didn't use that data, it was almost all sensor related...for other aspects though it seems essential.
The industry is medical device mfg, but I don't think it's relevant since all industries could use it.
This sounds like another interesting industry...it sounds like we have R&D examples from energy, healthcare, and manufacturing which is promising.

It does seem that the more R&D Data Science positions are more couched in typical engineering roles. I'm doing a job search now and I don't see anything strictly labeled "Data Science" in my searches although some job descriptions obviously involve that. I'll certainly post here any interesting positions I get to interview for. Like you said, Jacob, I think I need to find things that cater to my skillset. I couldn't compete in the strictly statistical analysis jobs.
 

Stadtaffe

Sparrow
Needing to get "very good at statistics" is not necessarily the case. There are several branches within data science, maybe about three. Data engineering, which is a sort of plumbing job where you transform data and move it around, set up jobs to run in the small hours of the morning (where SQL and Python, or MongoDB and Java, together carry out the job, for example), and worry the next day whether they ran or not. Then data science, which I would say is machine learning algorithms: some data cleaning and preparation, then algorithm selection and tuning. Neither of these two really needs statistics, not often. These days the algorithms may include image recognition and computer vision. Then data analyst, which involves the statistics you are worried about, Minitab etc.
 

Waverer

Robin
So I've made a start now, working my way through Eric Matthes' Python Crash Course. It introduces all the core elements like Lists and If Statements and then three simple projects - a basic computer game, data visualisation work and a web application.

I think I also need to learn SQL, which I gather can be a matter of a few days. Can anyone recommend the best way to learn it?

Once I am done with the Matthes book, does anyone have any recommended resources? It could be a book on more advanced data science techniques (seems like Matthes just goes up to what I can already do in Excel). Or a big web page or online course.

I think my ultimate goal in the next few months is not to learn how to make web applications but to really master data visualisation and being able to produce outputs that can't be done in Excel.

Any specific advice much appreciated.
 

kel

Pelican
I don't know offhand, but I've seen SQL "games" so to speak - given this table, get this output, increasing in difficulty forcing you to do creative aggregations and window functions and such. Seems like a good way to do it.
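For a flavour of what those exercises look like, here is a small self-contained sketch using Python's built-in sqlite3 module. The table and its rows are invented for illustration. It shows one plain aggregation and one window function; the window function needs the SQLite library bundled with your Python to be version 3.25 or newer, which I'm assuming here.

```python
import sqlite3

# In-memory database with a made-up table of daily profit and loss per trader.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (trader TEXT, day INTEGER, pnl REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [
        ("ann", 1, 100.0), ("ann", 2, -40.0), ("ann", 3, 25.0),
        ("bob", 1, 10.0), ("bob", 2, 30.0),
    ],
)

# Aggregation: one row per trader with their total P&L.
totals = conn.execute(
    "SELECT trader, SUM(pnl) FROM trades GROUP BY trader ORDER BY trader"
).fetchall()

# Window function: keep every row, but add a running total per trader.
running = conn.execute(
    """
    SELECT trader, day,
           SUM(pnl) OVER (PARTITION BY trader ORDER BY day) AS running_pnl
    FROM trades
    ORDER BY trader, day
    """
).fetchall()

print(totals)
for row in running:
    print(row)

conn.close()
```

The difference between the two queries is the thing the harder puzzles lean on: GROUP BY collapses rows, while a window function computes across related rows without collapsing them.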

For what it's worth, (and I'm not shitting on your idea at all because I love SQL and I judge engineers heavily based on whether they're competent in SQL - at least understanding the concept of relational databases - and those who obviously don't understand it and try to treat the database like a hash table in application code) the data scientists I've worked with have generally not been very good at SQL, sometimes just not knowing it at all.
 

kel

Pelican
Tough to say. I'm self-taught (mostly, I did go to university but that was just to get the paper, I was already programming and making money at it even) and I started years ago when I could just poke around on QBASIC. I wouldn't recommend that as an efficient way of going about it, though.

I think you're probably doing right by picking a book - don't labor over the decision, find something reasonably well reviewed and available and just go with it - and working your way through it. Get comfortable with the (a) language first, get to the point where you can do things (in general. Just, like, you have an idea for a thing, you think "yeah, I know how I'd go about doing that") and program daily to stay sharp and develop. It'll take time. In an ideal world you'll be able to get a lower-level job quickly and then you can learn while getting paid for the privilege, and get an inside look at how real stuff is done in industry (for better or worse).

If it's any consolation, your average programmer/data scientist out there is utter, utter shite, not really qualified to even pump gas let alone be earning six figures pumping out the buggy garbage they do. That sucks, but just to say: don't worry about your imposter syndrome too much. Just focus on the process.
 