How to Set up an Institutional Repository In 5 Easy Steps


Ask around for the easiest way to gather information these days, and most people will suggest searching online. The rationale is simple: it is easy, reliable, and available at your fingertips. Because online search is so widely used, you must ensure your institution's research output can be found online for relevant search queries. Doing so not only boosts traffic but also leads to more visibility and citations. But how do you make it happen? Building an institutional repository is a great starting point.

An institutional repository is an electronic archive that stores and preserves the research or scholarly output of an institution. As internet use has grown, institutional repositories have had to evolve beyond simple archives. Yours should be the focal point for disseminating your research output online, which means it must be meticulously organized, easily accessible to readers, and search engine friendly. Building something that complex from scratch can feel daunting, but it doesn't have to be.

Follow this step-by-step guide to set up an institutional repository and improve the visibility of your research output.

The steps involved in building an institutional repository

University of California's eScholarship Repository, one of the biggest institutional repositories out there.

Whether you are new to institutional repositories or an experienced hand, this guide features plenty of valuable tips and insights. It covers every step involved in great detail, from defining your content policies to what metrics you should track after setting up the repository.

1. Craft an institutional repository policy

Well-defined policies go a long way in bringing clarity. They ensure that everyone is pulling in the same direction. Imagine setting up an institutional repository (IR) without specifying the format in which content should be submitted or what happens when a faculty member leaves.

Avert the ensuing confusion by taking the time to lay down all the necessary rules and guidelines right from the get-go. Be sure to cover topics such as:

  • What type of content can be deposited?
  • The standard format for different types of content.
  • The formatting guidelines that ought to be followed by contributors.
  • What is the maximum storage space allocated to each contributor?
  • Who is eligible to deposit, and what can they deposit?
  • Who is responsible for copyright vetting?
  • What happens when a faculty member leaves?
  • How to handle grey literature and other unpublished materials?
  • How often should the content be refreshed?
  • What is the procedure for withdrawing/removing a paper from the repository?
  • The access levels — Who can read what? What can be repurposed?
  • What is the name of the repository?
  • The structure and hierarchy in which documents are stored and presented
  • What metadata should be submitted along with the resource?
  • The metadata convention that needs to be followed.
  • How to digitize documents that are only available in print?

This is not an exhaustive list, but it will help you get started. Break down the different aspects of an IR from content to the submission process and access privileges, and then specify rules and guidelines for each. Not only does it save time and effort and reduce confusion, but it will also help you choose the right IR software for your institution.
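
Once the rules are agreed, it can help to write some of them down in a machine-readable form so that submissions can be checked consistently. The sketch below is a minimal, hypothetical illustration: the `DepositPolicy` fields, the accepted content types and formats, the required metadata fields, and the 500 MB limit are all assumptions made up for this example, not requirements of any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepositPolicy:
    """Hypothetical policy; every default below is an illustrative assumption."""
    allowed_types: frozenset = frozenset({"article", "thesis", "dataset", "presentation"})
    allowed_formats: frozenset = frozenset({"pdf", "docx", "xml"})
    required_metadata: tuple = ("title", "author", "date", "department", "license")
    max_file_mb: int = 500


def check_submission(policy, submission):
    """Return a list of human-readable policy violations (empty if none)."""
    problems = []
    if submission.get("type") not in policy.allowed_types:
        problems.append(f"content type not accepted: {submission.get('type')}")
    if submission.get("format") not in policy.allowed_formats:
        problems.append(f"file format not accepted: {submission.get('format')}")
    if submission.get("size_mb", 0) > policy.max_file_mb:
        problems.append(f"file exceeds {policy.max_file_mb} MB")
    metadata = submission.get("metadata", {})
    for key in policy.required_metadata:
        if not metadata.get(key):
            problems.append(f"missing metadata field: {key}")
    return problems


# Example: a thesis submitted with incomplete metadata.
submission = {
    "type": "thesis",
    "format": "pdf",
    "size_mb": 12,
    "metadata": {"title": "Example thesis", "author": "A. Researcher"},
}
for problem in check_submission(DepositPolicy(), submission):
    print(problem)
```

Even if your platform enforces these rules for you, drafting them this precisely tends to surface the questions your policy has not answered yet.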

Before we move on, let’s take a look at Stanford University's Digital Repository. They have clearly outlined the repository's purpose and how different members of their community should use it.


2. Select a repository platform

open source vs proprietary infographic

To get your institutional repository up and running, you need a repository platform capable of hosting, managing, and showcasing your research output. There are different types of repository platforms out there: open-source platforms, proprietary platforms, and Electronic Theses and Dissertations (ETD) deposit systems.

Since each institution has its own needs and requirements, you must choose a platform that fits your institution’s context. Still, there are certain key features and considerations to look for when evaluating these platforms. Use the considerations below to select the platform that works best for you.


Locally hosted platforms offer you more flexibility and control over your institutional repository. Plus, data transfer will be much faster. However, you will have to spend money and labor hours on supplementary infrastructure and maintenance of the local server. You will also need to take care of local upgrades and patching.

With cloud-hosted platforms, you will be able to access and store data remotely. Your institution does not have to worry about buying a server or maintaining one. Most of these platforms offer dedicated customer support and will take care of software patching and security. This makes the repository less of a hassle to manage, but you will have limited control over the software, and these platforms are likely to be more expensive.


Each repository platform offers a specific set of features. You need to identify features that will help your repository fulfill its overarching goals. For instance, if you want to gain more citations and visibility, you will need a search-engine-friendly platform that offers powerful SEO management functions.

Many of the platforms on the market are outdated and only help preserve research output. Not only does this make your institutional repository less discoverable, but the workflows are also more complicated and rigid, which makes for a poor experience for both readers and depositors.

So, if you want your institutional repository to be advanced and comprehensive, you should look for a platform that offers:

  • Seamless approval/rejection workflow
  • Effective management of Open Access workflows
  • Automated data harvesting from citation databases
  • Creation and management of researcher profiles and publications
  • Integrations and interoperability with your existing tools and software
  • Support for, and effective rendering of, all the different types of assets and content formats: preprint journal articles, electronic theses and dissertations, podcasts, video presentations, etc.
  • Powerful analytics and SEO tools
  • Discovery of copyright information for all publications

The cost involved and resources available

Open-source platforms are free, but they require more effort from a development and support perspective. A small team of librarians may not be able to manage and run the platform effectively. You will need to either hire skilled employees or train your existing staff in order to set up the repository platform and get it up and running.

On the other hand, most proprietary platforms are not free. However, they do come with dedicated customer support and tons of documentation. Your team will be able to get onboarded and commence work much faster.

So, before you make a decision, evaluate the budget and human resources that you have at your disposal.
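
A simple weighted scorecard can make this comparison less subjective. The sketch below is illustrative only: the criteria, the weights, the two candidate names, and every score are made-up assumptions that you would replace with your own evaluation.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores (0-5) into one weighted average."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


# Hypothetical weights: how much each consideration matters to your institution.
weights = {"features": 0.4, "hosting_fit": 0.2, "cost": 0.2, "staff_effort": 0.2}

# Hypothetical candidates, scored from 0 (poor) to 5 (excellent) on each criterion.
candidates = {
    "open_source_local": {"features": 4, "hosting_fit": 5, "cost": 4, "staff_effort": 2},
    "proprietary_cloud": {"features": 4, "hosting_fit": 3, "cost": 2, "staff_effort": 5},
}

# Rank the candidates from highest to lowest weighted score.
ranked = sorted(
    candidates,
    key=lambda name: weighted_score(candidates[name], weights),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {weighted_score(candidates[name], weights):.2f}")
```

The ranking is only as good as the weights, so agree on them with your stakeholders before scoring any platform.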

3. Get supporting IT infrastructure and train relevant stakeholders

Once you’ve chosen the platform, the next step is to get all the additional software and hardware required to run the repository. That means you will need:

  • Computers that are capable of running the repository platform
  • Servers that can securely host and make the content accessible to readers within your institution and outside
  • Software or platforms that can convert formats such as PDF and MS Word into JATS XML and other XML formats
  • SEO tools to monitor and analyze incoming organic traffic
  • A citation management solution to gather and organize citation data

Before you start crossing items off your to-buy list, you must evaluate if the platform, tools, or software aligns with your requirements or not. For instance, if you plan to allocate 10 GB of storage space to each researcher in your university, you need to make sure that you have enough space available in the first place.
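
The storage check above is simple arithmetic, and it is worth doing before you buy anything. Here is a rough sketch; the researcher count, the headroom allowance, and the available capacity are hypothetical numbers chosen for illustration.

```python
def required_storage_gb(researchers, quota_gb_per_researcher, headroom=0.0):
    """Total capacity needed if every researcher fills their quota.

    headroom is an optional fractional allowance (0.25 means 25% extra).
    """
    if researchers < 0 or quota_gb_per_researcher < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    return researchers * quota_gb_per_researcher * (1 + headroom)


# 10 GB per researcher comes from the example above; 2,000 researchers,
# 25% headroom, and 20,000 GB of available capacity are assumed figures.
needed = required_storage_gb(researchers=2000, quota_gb_per_researcher=10, headroom=0.25)
available_gb = 20_000
print(f"needed: {needed:,.0f} GB, available: {available_gb:,} GB")
print("enough capacity" if available_gb >= needed else "capacity shortfall")
```

This is a worst-case estimate, since few researchers will fill their quota, but it gives you an upper bound to plan against.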

Simultaneously, start training your librarians, IT technical associates, faculty, student liaisons, and other key stakeholders so that they know their roles and responsibilities clearly. Researchers must have a clear understanding of the submission process — where to submit, the type of content, stage of review, copyright regulations, expected format, approval cycle, among other things. On the other hand, an IT Technical Associate must be aware of what needs to be patched regularly, how often the data should be backed up, what to do in case there is unauthorized access, and more.

Ultimately, robust software and hardware are only effective when they are deployed and maintained correctly. When the repository and its supporting ecosystem are functioning as intended, you are bound to see an uptick in adoption and usage.

4. Populate the repository with relevant content

IIT Madras populating its repository with pieces from before the repository was established

There are two ways to get started: digitize your print collection of theses and dissertations and encourage your faculty and researchers to use the repository to self-archive their existing scholarly work. In doing so, you will be able to populate the repository with relevant content quickly.

To kick off the digitization process, you need to first evaluate the present condition of the piece. If it is still legible, get in touch with its author, and seek their permission to digitize their work and make it available for open access through your institutional repository.

Once you get their permission, use high-end overhead book scanners to scan the document. Then, process it with Optical Character Recognition (OCR) software to turn it into an editable, searchable PDF. After that, assess the quality of the converted file and then deposit the PDF to the repository on behalf of the author.

The self-archiving process starts with speaking to library liaisons. Collaborate with them to devise a plan to get faculty and students to deposit their existing work to the repository. However, this doesn’t always work because the author may not have the time to do it themselves.

Instead, you can adopt a deposit-by-proxy model. Review the existing papers, Electronic Theses and Dissertations, and other pieces of content authored by your faculty and researchers. Find out the copyright status of each piece and whether it can be repurposed. Then, contact the author and seek their permission to deposit the piece to your repository on their behalf.

5. Promote usage and track the effectiveness

Royal College of Surgeons promoting their new repository on Twitter

Building a working repository is just the beginning. The moment you think of it as an if-you-build-it-they-will-come project, you are setting yourself up for failure. After the initial hype dies down, there will be a sharp decline in deposits or downloads. So, you must keep promoting the repository and reviewing key metrics regularly to see where you are at.

To promote usage of your institutional repository, you must:

  • Make the deposit process as seamless as possible.
  • Ensure the approval-rejection cycle is not too long.
  • Offer copyright detection technology to help authors identify and discover copyright information about their publications.
  • Accept multiple forms of grey literature.
  • Help authors understand how to make their work Open Access.
  • Make sure the repository pages and the content on them are optimized for Google Search and other search engines.
  • Launch awards for most active contributors, most-read pieces, etc.
  • Encourage authors to share links to their repository content on social media and other online forums.
  • Promote the launch of the repository with press releases and special events.
  • Set up weekly email newsletters featuring repository content.
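
On the search optimization point, one concrete step is to put bibliographic `<meta>` tags on each item page. The sketch below generates tags with the `citation_*` names that Google Scholar's inclusion guidelines describe. The record and URL are hypothetical, and you should check the current guidelines and your platform's documentation before relying on this, since many platforms emit these tags for you.

```python
from html import escape


def scholar_meta_tags(title, authors, publication_date, pdf_url):
    """Build bibliographic <meta> tags for one repository item page."""
    pairs = [("citation_title", title)]
    pairs += [("citation_author", author) for author in authors]  # one tag per author
    pairs.append(("citation_publication_date", publication_date))
    pairs.append(("citation_pdf_url", pdf_url))
    # Escape the values so characters such as & or " cannot break the HTML.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    )


# Hypothetical record and URL, for illustration only.
print(scholar_meta_tags(
    title="Soil Moisture & Crop Yield",
    authors=["Doe, Jane", "Roe, Richard"],
    publication_date="2021/05/12",
    pdf_url="https://repository.example.edu/items/123/paper.pdf",
))
```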

‘You can't manage what you can't measure’ may be an age-old saying, but it is very much applicable to institutional repositories. Imagine reviewing the numbers and finding out that the downloads-to-views ratio of the Computer Science department is much higher than that of all the other departments, but its new deposits are comparatively lower.

Getting such in-depth insight will enable you to make much stronger points when you meet the Head of the Department. This is why you must use Google Analytics, Google Search Console, Product Analytics tools, and other SEO tools to track metrics such as:

  • Number of downloads
  • Number of deposits
  • Number of publications indexed
  • Organic traffic
  • Social traffic
  • Search position changes
  • Backlinks
  • Citations
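
To make a comparison like the Computer Science example concrete, here is a small sketch that rolls up per-item numbers by department and computes the downloads-to-views ratio. The row layout and the sample figures are assumptions; in practice the numbers would come from an export of your analytics tool or repository platform.

```python
def department_metrics(rows):
    """Sum views, downloads, and deposits per department and add the downloads-to-views ratio."""
    totals = {}
    for row in rows:
        dept = totals.setdefault(
            row["department"], {"views": 0, "downloads": 0, "deposits": 0}
        )
        for key in ("views", "downloads", "deposits"):
            dept[key] += row[key]
    for dept in totals.values():
        # Avoid dividing by zero for departments with no recorded views.
        dept["downloads_per_view"] = (
            dept["downloads"] / dept["views"] if dept["views"] else None
        )
    return totals


# Hypothetical monthly figures, for illustration only.
rows = [
    {"department": "Computer Science", "views": 1200, "downloads": 600, "deposits": 3},
    {"department": "History", "views": 900, "downloads": 180, "deposits": 14},
]
for name, stats in department_metrics(rows).items():
    print(name, stats)
```

A department with a high ratio but few deposits, like the hypothetical Computer Science row here, is a good candidate for a targeted deposit drive.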


In the short run, building an institutional repository may help you bring structure to your research output, comply with funding mandates, and make that output easier to access. But it is the long-term benefits that make it a really compelling proposition: increased visibility, Open Access friendliness, and greater prestige for the institution.

Now, building, maintaining, and scaling a repository may sound overwhelming, but the good thing is that you don't have to do it all alone. Typeset's cloud-hosted Institutional Repository (and CMS) can make your transformational journey much more straightforward. Our best-in-class solution allows you to host, manage, and showcase your research corpus seamlessly.

It comes with everything you need: integrated writing and publishing tools, copyright detection, streamlined deposit and approval workflows, SEO-optimized summaries, search-friendly indexing, robust reporting and analytics tools, and much more.