The Group: Morningstar’s Research group provides independent analysis on individual securities, funds, markets, and portfolios. The Research group also provides data on hundreds of thousands of investment offerings, including stocks, mutual funds, and similar vehicles, along with real-time global market data on millions of equities, indexes, futures, options, commodities, and precious metals, in addition to foreign exchange and Treasury markets. Morningstar is one of the largest independent sources of fund, equity, and credit data and research in the world, and our advocacy for investors’ interests is the foundation of our company.

The Role: As a Data Scientist, you will be a leading contributor to the implementation of Artificial Intelligence (AI) within Data Collections software applications, APIs, and other data products. This role requires significant interaction with both upstream and downstream stakeholders across Technology, Data, Products, Sales/Service, and Research.

The Data Scientist will transition approved Data Collections AI products from the prototype phase to a fully fledged, scalable, consumer-facing service. Often, these services must be integrated into Morningstar’s platform of financial products so that our clients can use these software tools in the investment decision-making process.


  • Design & develop enterprise solutions to be flexible, scalable & extensible.
  • Improve complex data flows, data structures, and database design to support the move to the next platform.
  • Be a role model for the team, collaborating on good object-oriented design & domain modeling. Enforce good agile practices such as test-driven development and continuous integration.
  • Build solutions that incorporate numerical techniques such as linear algebra, machine learning, statistics, and optimization.
  • Hands-on development will be an integral part of the responsibilities.
  • Develop areas of continuous and automated deployment.
  • Introduce and follow good development practices, innovative frameworks, and technology solutions that help the business move faster.
  • Follow best practices such as estimation, planning, reporting, and process improvement in everyday work.


  • 1-3 years of experience in the data science field
  • An advanced degree in engineering, computer science, statistics or related field is preferred
  • Expertise with either Python or Java is essential, while experience with both is desirable; other programming language skills are highly desirable. Experience with Python packages such as pandas, scikit-learn, TensorFlow, NumPy, and NLTK is a plus.
  • Basic knowledge of statistical/ML/AI algorithms
  • Experience with DevOps tools (e.g. Splunk, Git, uDeploy, Jenkins, Control-M) is desirable
  • Experience with Agile software engineering practices
  • Experience with back-end XML, relational, and file-based databases (e.g. SQL, Postgres, Redshift, Netezza, HDFS)
  • Experience developing and deploying solutions using services in the AWS ecosystem (Lambda, EC2, RDS, EMR)
  • Experience with the Hadoop stack (MapReduce, Pig, Hive, NiFi, Spark) is desirable
  • Experience with at least one statistical modeling language (e.g. R, MATLAB, Python)

