Breaking into Industry ML/AI Research Without a PhD
February 27, 2022
I am currently an Applied Scientist who is doing full-time machine learning (ML) research at Amazon without a PhD. I get to work on intellectually difficult problems with strong potential for greenfield innovation, work with really bright and motivated people, and earn high industry pay¹ while doing what I love. Unfortunately, while a lot of people are interested in entering machine learning, there isn’t a lot of guidance online for those trying to transition into ML from software engineering. This post aims to bridge that gap by describing my journey from college to software engineering to becoming a machine learning researcher. While the details of this post are specific to my experiences, I believe there are general takeaways that are applicable to everyone, which are summarized at the bottom of the post.
College (2015–2019)
My college journey was highly non-linear and led to a late start in machine learning research. I entered Princeton with a strong interest in computational biology research, which I pursued until halfway through junior year. At that point, I realized I was overspecialized in the biological domain and I wanted to instead develop skills to solve problems across many domains. I thus became interested in machine learning and in particular, computer vision as an impactful real-world application of machine learning.
However, I had only taken ML classes at that point and lacked practical experience. My three internships consisted of one summer in bioinformatics and two summers in software engineering. My junior independent project was also in bioinformatics. Additionally, Princeton at the time only had a few faculty members doing research in computer vision and deep learning, none of whom had space for more undergrads in their labs.
Luckily, I was able to connect with a newly hired assistant professor during the summer before my senior year, and I became one of the first students to join his lab. During that summer, I taught myself deep learning through reproducing papers and participating in Kaggle competitions such as humpback whale identification, as well as reading literature to brainstorm for my senior thesis. When I met with my advisor for the first time, I told him that I wanted to work on a project that had publication potential, as I knew that I would need publications whether I applied to grad school or tried to recruit for ML positions in industry. He thus assigned me to work with a senior PhD student on a 3D vision project. While I was completely new to the field, I quickly got up to speed, and through hard work (I averaged 25 hours of research per week with a full course load) I was able to make significant contributions during my senior thesis. Right before my graduation week, we submitted a paper to NeurIPS. This paper unfortunately got rejected, but after some additional work over the summer, it was later accepted at CVPR², which is a top computer vision venue.
First Job at Amazon (August 2019)
My senior thesis experience revived my passion for research and I decided I wanted to do machine learning research as a career. However, I had already accepted a new grad offer at Amazon for software engineering, and I didn’t have any leverage to recruit for ML research positions. The CVPR 2020 paper from my thesis work had not yet been accepted, so my only publications at the time were in biology and physics. I had no concrete evidence to prove that I could contribute meaningfully to an ML research team.
Thus, I shifted my strategy to recruiting for a machine learning infrastructure team, which I planned to use as a stepping stone for research positions after I gained some work experience. My rationale was that my familiarity with the machine learning software ecosystem would help me stand out compared to engineers who mostly lack research experience, while experience with delivering features end-to-end would help me stand out compared to scientists who lack production experience.
Although new grads at Amazon typically have no control over their team placement, I cold-emailed hiring managers and managed to network my way into the cloud machine learning division of Amazon Web Services (AWS). I joined the SageMaker team, which sells an all-in-one cloud platform for machine learning. At SageMaker, I worked for one year on a service that reduces the real-time inference cost and latency of deep learning models. I contributed to a key feature launch and authored an accompanying official AWS blog post ⁴.
Turning Point (June 2020)
Shortly after, my CVPR paper from senior thesis work² was accepted and published. Thus, roughly one year after graduating college, I had both a top-tier ML paper and a feature launch under my belt. Things had gone according to plan, and I now had enough leverage to begin talking to ML research teams at Amazon.
Ironically, I started to get cold feet and second-guess my plan because I was on the cusp of getting promoted to software engineer II. As my current team had no research scope, I would have to switch to a research team if I wanted to do research, but switching teams would reset my promotion timer. As my friends and peers were starting to get promoted at the time, I could not stomach the thought of being the very last person to get promoted. I also began to doubt whether I would be able to succeed in machine learning research without a PhD.
I set up coffee chats to get advice from both Amazon scientists and engineers with research backgrounds who ultimately chose to commit to engineering. Most of the conversations were quite discouraging and sowed further doubt in my mind; many told me that my only option was to go back to school and do either a research-based master’s or preferably a PhD. On the other hand, I also felt encouraged by a smaller handful of people who told me that internal transfer was possible and that they knew people who had successfully done it. Overall, these conversations stirred more doubt and made the stakes feel higher: if I failed, I would not only have thrown away an early promotion opportunity but also “wasted” a couple years of my life.
But after self-reflection, I realized my concerns were largely superficial and trivial. Risky bets such as switching career paths are easier to make earlier in life, when one has fewer personal responsibilities such as a family to provide for. I realized I wouldn’t regret failing to succeed in research and then switching back to software engineering, but I would heavily regret not trying ML research, as I believed myself to be capable of succeeding. I still couldn’t stomach the thought of delayed promotion due to my competitive nature, but luckily I was able to recognize that career progression is a marathon and not a sprint. Investing in long-term success over short-term success would not only make me happier, but also leave me better equipped for a future in which ML will only become more ubiquitous.
Thus, I began looking for ML research teams internally, and connected with a manager who had just founded a new team in Prime Video working on machine learning for understanding videos. As I didn’t have a master’s or PhD degree, I was not allowed to directly interview for the Applied Scientist role, but I could interview for research engineer. I thus did an internal interview loop for research engineer, which involved one ML algorithm round and one coding round. During this process, I encountered a surprising amount of degree bias; though I had passed both rounds, I was later told that a few people, including my manager, had raised concerns over my lack of a master’s or PhD. This is a theme I will expand upon in the remaining sections.
Nevertheless, my team transition was approved and I became the team’s first hire. I committed fully to my goal of becoming an ML research scientist.
From Research Engineer to Scientist (July 2020 – May 2021)
Upon joining Prime Video as a research engineer, my next step was to transition internally to become an applied research scientist. As Applied Scientists at Amazon mostly have PhDs, I needed to demonstrate the ability to independently do research at the level of a PhD student or strong Master’s student. I would then need to summarize those research projects and gather peer feedback in the style of a promotion document in order to apply for a role transition. Most Applied Scientists are hired externally and very few engineers successfully make the transition internally, so the odds were stacked against me especially due to degree bias. At times it felt like I was shooting in the dark as there were very few people who could guide me through the process.
By reasoning about the expectations of a research scientist and assessing my current gaps, I created a roadmap towards doing fewer engineering projects and more ML research in my day-to-day work. However, executing this plan proved to be more difficult than anticipated. Although my manager was supportive of my transition to become a research scientist, company performance expectations made it difficult for me to pursue my career goals. If I took too much company time to do research projects as an engineer, I would naturally have fewer engineering deliverables than other engineers. This would be quite detrimental from a performance evaluation perspective.
Finding the right balance was a delicate process that required frequent discussion of expectations and goals with my leadership chain, as well as investing significant time outside of work to work on research projects. Luckily, I pursued this transition during the peak of COVID and thus had far fewer distractions and social commitments than usual. Without working longer hours, I likely would not have been able to gather enough research datapoints and peer feedback for transitioning to Applied Scientist.
One of the research projects I worked on was with a principal scientist from a sister team, who was able to provide feedback supporting my transition document. I made novel contributions and helped develop a state-of-the-art self-supervised model for movie scene segmentation, which led to a second CVPR paper³ and a company-wide keynote. I then productionized this model and deployed it. Following this, I wrote a transition document and passed a technical assessment with a senior scientist. Even though I had all the deliverables to prove that I met the expectations of an Applied Scientist, my transition was still delayed due to internal review of my document. In May 2021, my transition to Applied Scientist was finally approved, 3 years after my entry into machine learning and 11 months after I joined Prime Video.
As an Applied Scientist now, I lead forward-looking research that is likely to yield new features and optimizations for products such as Prime Video. I spend roughly half my time developing and then productionizing machine learning models that power new features, and spend the rest of my time doing publishable research and writing papers. Compared to when I was an engineer, I now have more autonomy and ownership over the direction of my work, which gives me a greater sense of fulfillment. ML research is my dream job for now and I am incredibly lucky to get paid to do what I love.
Practical Advice
If you are a student and already know that you are interested in ML research, the best thing you can do is get research experience and coauthorship (or even lead authorship) on published papers. Speak to faculty you are interested in working with and see whether there are any projects that require help with running experiments. Sometimes faculty are too busy to reply or mentor you directly, so in this case you can try finding a grad student who needs help with their projects and wants to work with you. Taking additional classes is helpful but yields diminishing returns, since most knowledge required to do research is highly specific and is best learned by reading relevant papers and actually doing hands-on experiments. If you are already close to graduation and don’t have research experience, it could be worth it to do a 1–2 year master’s program, but only if the program is research-focused (usually with a thesis requirement). A class-focused master’s program is not a productive use of time when you could be gaining experience and money through working in industry. Whether to do a PhD is a complicated topic that deserves its own post, but essentially I do not think a PhD is necessary for career progression within industry research. A PhD is only a good idea if you want highly focused time to work on a very specific problem, and have interest in becoming a professor afterwards.
Similarly, if you are in industry and don’t have research experience, one option is to go back to school for a research-focused master’s program. You can also try doing a lateral switch internally as I did, by first joining a research team as a software engineer and gradually earning more scope to do research work. It is generally easier to transfer internally than to switch roles externally when you don’t meet job role requirements for academic pedigree, because recruiting systems are highly automated and optimized to minimize false positives (but not false negatives). The downside is that it takes time to build connections and earn trust when growing inside a company. One point in your favor is that engineering experience is valuable for scalably iterating on experiments, and often will give you a leg up over pure scientists, especially in empirical domains.
Regardless of where you are, you will need compute to do ML research, which is increasingly compute-intensive. If you don’t have access to an academic or industry computing cluster, I would advise building your own PC (see this blog post ⁵ by my friend Tim Dettmers at UW) and doing side projects such as competitions hosted by ML conferences ⁶ to get up to speed with ML and build a portfolio. I think Kaggle competitions are also a fair option for learning, but they are not ideal for building a portfolio, as the projects are often less relevant to academic literature and more focused on real-world usage, in contrast to competitions hosted at ML conferences such as NeurIPS.
Concluding Thoughts
When reflecting on my time so far at Amazon, I can think of the following major lessons:
- I am the only person who can and should own my career. As a new grad, I didn’t understand my managers’ incentives and often resented them when they didn’t give me the projects I wanted. What I didn’t understand was that I should never expect anyone to go to bat for me. I needed to go to bat for myself and create the opportunities I wanted.
- Luck is when preparation meets opportunity. While I was fortunate to have the right opportunity appear in Prime Video, I was also prepared to make the most of that opportunity through my hard work and networking.
- Rules are rarely written in stone. While a PhD is a hard requirement on machine learning research job postings, degrees are just a proxy for ability. A PhD signals that someone is likely capable of doing independent research, but there are plenty of people who don’t have a PhD and do amazing work. At the end of the day, the only thing that matters is whether you can get the job done or not. A degree can make it easier for you to get hired, but once you’re hired, no one cares what degrees you have. When in doubt about requirements, put yourself in the hiring manager’s shoes and think about what the functional responsibilities of the role are. In my opinion, the same applies to MBAs and other professional degrees.
- Do what you love for a living. Some people advocate working for a living and then doing what one loves outside of work. While I think this is an equally valid direction and admire those who are able to pull it off, I found it challenging. When I first started at Amazon on a non-research team, I tried to stay up-to-date with research papers after work, but it wasn’t sustainable because work already demanded a lot of my mental and physical energy. I thrive when my passion intersects with my work.
¹ Entry-level ML researchers in industry can expect to earn around 200K, and senior researchers can expect well above 500K.