Internship: Research Graph Foundation - Developing AI Pipelines

21 Oct 2024

This position is offered through the ANU Computing Internship (COMP4820 / COMP8830).

Company

Research Graph Foundation (researchgraph.org)

The Research Graph Foundation is a not-for-profit and collaborative venture to connect scholarly records across global research repositories. The Foundation contributes to developing capabilities that transform disconnected and siloed research activities into a connected network of scholarly works.

Project

Developing AI Pipelines

This internship focuses on developing source code and creating AI pipelines that use large language models (LLMs) to transform data into actionable insights. Interns will work on projects that emphasise the development of open-source tools and pipelines, using the Python and Rust programming languages. The internship is designed to foster skills in building AI-driven systems, with a particular focus on optimising data workflows and integrating LLMs for natural language processing tasks.

Interns will contribute to open-source code repositories, ensuring that all developments are accessible to the broader AI community. They will be responsible for creating and implementing AI pipelines that automate the transformation of raw data into valuable insights. In addition, interns will explore the use of large-scale LLMs, experiment with AI models, and optimise them for real-world data challenges.

Finally, the internship involves presenting project outcomes in seminars, contributing to GitHub repositories, and writing technical documentation to make the code and methodologies accessible to others in the field.

There are three components defined as separate projects in this internship:

Project 1: AI pipeline to classify emails: This project is focused on creating a retrieval-augmented generation (RAG) pipeline that reads a collection of emails and categorises them based on topic, priority, and other user-defined criteria.

Project 2: AI pipeline to classify images: This project experiments with AI models to group images based on user-defined queries such as the type of objects in the photos and background characteristics.

Project 3: AI pipeline to create topic taxonomy: This project creates an AI pipeline to classify articles into separate groups based on their topic. This pipeline will leverage open-source AI models.

These projects’ outcomes align with the Research Graph Foundation’s commitment to Open Science, and all produced materials and code will be publicly accessible under an open-source licence.

Required/Preferred Technical Skills

  • Essential: Experience in using Python for data analytics, and prior experience of working with large language models and prompt engineering.
  • Bonus: Linux environment experience.

Required/Preferred Professional/Other Skills

Ability to work independently and take initiative while knowing when to ask for help and communicate with others. Curiosity and diligence are needed for a research project, as is the ability to collaborate with a small team.

Delivery Mode

Remote.

Type of internship

Unpaid Placement.

How to apply

Applications are invited from students who have already passed the eligibility checks for the Computing Internship courses COMP4820 or COMP8830. Further information about the Computing Internships can be found on the Computing Internship page.

You can nominate multiple preferred Internship projects and host organisations through the one application form.
