An overview of Android Jetpack

Background

Announced with new additions at Google I/O 2018, the Paging Library is part of Android Jetpack, specifically the Architecture category. The library efficiently manages data that belongs to a long list. It is rooted in the principle of loading reasonably sized subsets, or "pages," of data into the model rather than an entire set that may be too large. This can reduce memory usage and network bandwidth. The library includes components that manage how data is displayed and components that asynchronously update data from Data Sources. A PagedList can be added to new and existing projects with little effort.

Major Components

Before diving into the core of the library, note that it is built with the latest Architecture Components: LiveData and Room. These are not discussed in detail here since they are worthy of their own discussions.

High-level architecture: Paging Architecture

PagedList and PagedListAdapter

PagedList is the key component of the library. It allows a RecyclerView to load data from a DataSource in chunks. A PagedList triggers its Boundary Callback when the view reaches the end of the PagedList.

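val config = PagedList.Config.Builder()
    .setPageSize(LOADING_PAGE_SIZE)
    .setEnablePlaceholders(true) // optional, default: true
    .build()
val data = LivePagedListBuilder(dataSourceFactory, config)
    .setBoundaryCallback(boundaryCallback) // Required for DataSource when out of data
    .build()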

Alternatively, pass the page size directly instead of a config:

val data = LivePagedListBuilder(dataSourceFactory, LOADING_PAGE_SIZE)
    .setBoundaryCallback(boundaryCallback) // Required for DataSource when out of data
    .build()

The PagedListAdapter binds a PagedList to a RecyclerView. It requires a comparator (a DiffUtil.ItemCallback) so that the RecyclerView can tell which items changed when a new PagedList arrives from the DataSource.
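
The comparator answers two questions about a pair of items: do they represent the same entity, and do they show the same content? The sketch below models that logic in plain Kotlin, without the Android classes, so it can run anywhere. The Item class, its fields, the ChangeKind enum, and the classify function are assumptions made for illustration; only the two method names mirror DiffUtil.ItemCallback.

```kotlin
// Hypothetical model class; the field names are assumptions for this sketch.
data class Item(val id: Long, val name: String)

enum class ChangeKind { UNCHANGED, CONTENT_CHANGED, DIFFERENT_ITEM }

// Same entity? Compare a stable identifier, as areItemsTheSame() would.
fun areItemsTheSame(oldItem: Item, newItem: Item): Boolean = oldItem.id == newItem.id

// Same visible content? Data-class equality compares every field,
// as a typical areContentsTheSame() would.
fun areContentsTheSame(oldItem: Item, newItem: Item): Boolean = oldItem == newItem

// Combines the two checks to decide whether a row needs to be redrawn.
fun classify(oldItem: Item, newItem: Item): ChangeKind = when {
    !areItemsTheSame(oldItem, newItem) -> ChangeKind.DIFFERENT_ITEM
    !areContentsTheSame(oldItem, newItem) -> ChangeKind.CONTENT_CHANGED
    else -> ChangeKind.UNCHANGED
}
```

In a real adapter, the same two comparisons would live in a DiffUtil.ItemCallback passed to the PagedListAdapter constructor.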

Creation of a PagedList: Creating a Paged List


DataSource holds the content from external sources such as network APIs and serves as the local store for PagedLists. A PagedList is generated by a LivePagedListBuilder with a DataSource.Factory passed into it, optionally with configuration parameters. The PagedList holds a copy of the data it loaded from the DataSource. This limits a PagedList to a snapshot of the data, but it allows multiple PagedLists to be created from a single DataSource. When a PagedList sends an "out of data" request to the network, the DataSource waits for the response from the service and then emits an invalidate signal so that each PagedList is updated as necessary.
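
The snapshot-and-invalidate cycle can be modelled without the library. In this rough sketch, InMemorySource, snapshot, onInvalidated, and append are hypothetical names rather than Paging APIs; the point is only that a snapshot never changes after it is taken, and that new data signals observers to ask for a fresh one.

```kotlin
// Library-free model; InMemorySource and its members are hypothetical names.
class InMemorySource(initial: List<String>) {
    private var items: List<String> = initial
    private val listeners = mutableListOf<() -> Unit>()

    // A snapshot is an immutable copy: later writes never change it.
    fun snapshot(): List<String> = items.toList()

    // Registers an observer that is told when existing snapshots are stale.
    fun onInvalidated(listener: () -> Unit) {
        listeners.add(listener)
    }

    // New data (for example a network response) replaces the contents
    // and signals every observer to request a fresh snapshot.
    fun append(newItems: List<String>) {
        items = items + newItems
        listeners.forEach { it() }
    }
}
```

An observer would typically respond to the signal by calling snapshot() again and handing the new list to its adapter, while the old snapshot stays valid for whoever still holds it.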

There are different DataSources based on the service requirements.

  • PageKeyedDataSource: When a request requires next/previous index keys.
  • ItemKeyedDataSource: When a request requires an item as a key.
  • PositionalDataSource: When a request requires an index to fetch next batch.
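
To make the difference between the three keying strategies concrete, here is a library-free sketch over an in-memory list of ten numbers. The backend list, the Page class, and the three function names are assumptions for illustration; a real DataSource would issue these requests against a database or a web service.

```kotlin
// Library-free model of the three keying strategies; all names are hypothetical.
data class Page(val items: List<Int>, val nextKey: String?)

val backend: List<Int> = (1..10).toList()

// Positional: the request carries a start index and a count.
fun loadByPosition(start: Int, count: Int): List<Int> =
    backend.drop(start).take(count)

// Item-keyed: the request carries the last item seen,
// and the backend returns the items that follow it.
fun loadAfterItem(lastItem: Int, count: Int): List<Int> =
    backend.filter { it > lastItem }.take(count)

// Page-keyed: each response carries an opaque token for the next page,
// or null when there is nothing left to load.
fun loadPage(key: String?, count: Int): Page {
    val start = key?.toInt() ?: 0
    val items = backend.drop(start).take(count)
    val next = if (start + count < backend.size) (start + count).toString() else null
    return Page(items, next)
}
```

The choice usually follows the shape of the service: an API that returns a "next" token suits the page-keyed approach, while a sorted local table suits the positional or item-keyed ones.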

You can also create a custom DataSource when necessary.

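class MyDataSource : ItemKeyedDataSource<Long, Item>() {
    override fun getKey(item: Item) = item.id // assumption: Item exposes a unique id
    override fun loadInitial(params: LoadInitialParams<Long>, callback: LoadInitialCallback<Item>) {
        val items = fetchItems(params.requestedLoadSize)
        callback.onResult(items)
    }
    override fun loadAfter(params: LoadParams<Long>, callback: LoadCallback<Item>) {
        val items = fetchItemsAfter(start = params.key, limit = params.requestedLoadSize)
        callback.onResult(items)
    }
    override fun loadBefore(params: LoadParams<Long>, callback: LoadCallback<Item>) = Unit // nothing to prepend here
}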

Boundary Callback

A Boundary Callback is the handler that runs when a PagedList reaches the "out of data" condition. This class relieves developers of having to track the relationship between the visible list and the loaded data themselves. When a PagedList displays the last item in its DataSource instance, it triggers the onItemAtEndLoaded() method.

class BoundaryCallback(val service: ServiceAPI, val storage: LocalStorage)
    : PagedList.BoundaryCallback<Item>() {
    // Database returned 0 items. We should query the backend for more items.
    override fun onZeroItemsLoaded() {
        // fetch data from service
    }
    // When all items in the database were loaded, we need to query the backend for more items.
    override fun onItemAtEndLoaded(itemAtEnd: Item) {
        // fetch more data from service
    }
}
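
The two callbacks can be exercised without Android using a small model. This is a sketch under stated assumptions: FakeService, LocalStore, and EndOfListHandler are hypothetical stand-ins for the service, the local storage, and the boundary callback, and the "network" is just an in-memory list.

```kotlin
// Library-free model; FakeService, LocalStore, and EndOfListHandler are hypothetical.
class FakeService(private val all: List<String>) {
    // Returns up to `limit` items that follow `lastItem` (or the first ones when null).
    fun fetchAfter(lastItem: String?, limit: Int): List<String> {
        val start = if (lastItem == null) 0 else all.indexOf(lastItem) + 1
        return all.drop(start).take(limit)
    }
}

class LocalStore {
    val items = mutableListOf<String>()
}

class EndOfListHandler(
    private val service: FakeService,
    private val store: LocalStore,
    private val pageSize: Int
) {
    // Local storage is empty: request the first page.
    fun onZeroItemsLoaded() {
        store.items.addAll(service.fetchAfter(null, pageSize))
    }

    // The last stored item was displayed: request the page that follows it.
    fun onItemAtEndLoaded(itemAtEnd: String) {
        store.items.addAll(service.fetchAfter(itemAtEnd, pageSize))
    }
}
```

In a real application the fetch would be asynchronous, and writing the result to the database would be what causes the list to refresh.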

Example of a Boundary Callback: Boundary Callback


Placeholders can be important to the visual benefits of paging. By default, placeholders are enabled, but they can be disabled in the config built with PagedList.Config.Builder() by calling .setEnablePlaceholders(false). When placeholders are disabled, the scrollbar is sized for only the pages loaded so far. Once the next page loads, the scrollbar "jumps" toward the middle and the user can scroll down again until the next page loads, and the process repeats.

With placeholders enabled, the PagedList tells the PagedListAdapter the total number of list items, but items that are not yet loaded are not held in memory, which preserves resources. They are represented as null. Once the data is loaded, they display normally.

The benefit of this is that the user now gets infinite, 'non-jumpy', scrolling without loading spinners. Since the view knows the list size, the scroll bar size is more accurate as well. However, keep in mind that the adapter should handle null items and the items should all be the same size. Handling null items can be done by setting default values in the ViewHolder where data is bound to an item. Also, the data source must be able to count the items, which is a feature bundled with libraries like Room.
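
The placeholder idea can be shown with a small library-free model: the list reports its full size from the start, unloaded positions are null, and the binding code substitutes a default. PlaceholderList, onPageLoaded, and displayText are hypothetical names chosen for this sketch, and the "Loading..." text is an arbitrary default.

```kotlin
// Library-free model of a placeholder-enabled list; names are hypothetical.
class PlaceholderList(totalCount: Int) {
    // Every position exists from the start; unloaded positions hold null.
    private val slots = arrayOfNulls<String>(totalCount)

    val size: Int get() = slots.size

    operator fun get(index: Int): String? = slots[index]

    // Fills in one page of real data starting at the given position.
    fun onPageLoaded(start: Int, page: List<String>) {
        page.forEachIndexed { offset, value -> slots[start + offset] = value }
    }
}

// What a ViewHolder might do: fall back to a default while the item is still null.
fun displayText(item: String?): String = item ?: "Loading..."
```

Because size is known up front, a scrollbar driven by this list would reflect the whole dataset even though only one page has been loaded.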

Below is an example of how infinite scrolling works with Paging and Placeholders set to true. As you can see, even though the page size is smaller than the entire list the scroll bar size indicates the length of the list and the user can smoothly scroll as pages are loaded.

Initial Load Size
The default page size is set in the config. However, developers may find it useful to load a larger page for the initial load so that the PagedList is not requesting more data immediately. This can be done using .setInitialLoadSizeHint(int) also in the config and can make scrolling the list more seamless.

By default, a PagedList prefetches data from the Data Source before it is needed, using a prefetch distance equal to the page size. The developer can set this to a larger or smaller value in the config using .setPrefetchDistance(int). The value determines how far ahead of the items being displayed the data is loaded. A larger value may speed up perceived loading because items are ready beforehand, while a smaller value may improve resource usage when a large prefetch is not needed.
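
How the initial load size and the prefetch distance interact can be illustrated with a simplified model. This is not the library's implementation: it assumes the next page is requested as soon as the accessed index comes within the prefetch distance of the end of the loaded data, and both function names are hypothetical.

```kotlin
// Simplified model of the prefetch trigger; not the library's implementation.
fun shouldLoadNextPage(accessedIndex: Int, loadedCount: Int, prefetchDistance: Int): Boolean =
    accessedIndex >= loadedCount - prefetchDistance

// Counts the page requests issued while scrolling from index 0 to lastIndex,
// not counting the initial load itself.
fun pagesRequestedWhileScrolling(
    lastIndex: Int,
    pageSize: Int,
    initialLoadSize: Int,
    prefetchDistance: Int
): Int {
    require(pageSize > 0) { "pageSize must be positive" }
    var loaded = initialLoadSize
    var requests = 0
    for (index in 0..lastIndex) {
        // Keep loading until the accessed index is outside the prefetch window.
        while (shouldLoadNextPage(index, loaded, prefetchDistance)) {
            loaded += pageSize
            requests++
        }
    }
    return requests
}
```

Under this model, scrolling to index 45 with a page size of 20 and a prefetch distance of 20 issues three extra requests when the initial load is 20 items, but only one when the initial load is 60, which is the effect the initial load size hint is meant to achieve.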


Here are the dependencies used in the source code; they are common to many PagedList applications:

// architecture component 
implementation "android.arch.lifecycle:extensions:$archComponentsVersion"
implementation "android.arch.lifecycle:runtime:$archComponentsVersion"
kapt "android.arch.lifecycle:compiler:$archComponentsVersion"

// room database support
implementation "android.arch.persistence.room:runtime:$roomVersion"
kapt "android.arch.persistence.room:compiler:$roomVersion"

// paging support
implementation "android.arch.paging:runtime:$pagingVersion"

What Now?

You may be wondering what the real-world value of Paging is. The truth is that many of PagedList's benefits are behind the scenes, making the application more efficient. By managing portions of data rather than entire sets, an application conserves resources for when they are needed. With the announcement of Paging in 2017 and the additions in 2018, developers can refactor older applications from List to PagedList with relative ease.

By leveraging this library in new applications, developers can reduce the time it takes to create an efficient PagedList project. The Paging library makes creating smooth, infinite-scrolling lists much easier. It also makes updating list differences asynchronously more intuitive. When connected to a network, the Boundary Callback class coordinates requests for more data between the PagedList, the Data Source, and the network. In the past, most of this logic was non-trivial and had to be written by the development team, but now it is packaged and ready to go in the library.

Use Cases

Multiple Views
Utilizing Paging can be useful when there are multiple activities in the application that contain PagedLists. When the user views the first list, the initial page is loaded, rather than the entire dataset. Then, the user can switch to another PagedList view and see another page that's loaded onto that. However, in either instance, the entire list was not loaded. When the user scrolls or moves to another list and scrolls, only then are the subsequent pages loaded. This can add another layer of efficiency to applications that display lots of data. In the background, the PagedLists, Data Sources, and Boundary Callbacks can manage when data is needed and when there are differences.

Retail Applications
Retail applications can benefit from the use of this library as well. Commonly, retail applications contain and display large quantities of products to the user. By using placeholders, these applications can display data using infinite scrolling, which could make the UI more user-friendly and appealing. Also, due to the quantity of data, loading shorter pages at a time would improve performance. Developers could leverage Boundary Callbacks to make network calls for more data and prefetch X items to have the data ready to go. Retail applications usually have robust filtering and sorting, so the data being displayed on the page often changes as filters are added or removed. Because the DataSource.Factory creates a new Data Source whenever the data is invalidated, the Paging library's asynchronous calls can spot the differences and replace only the items that changed in the list. Developers will now have an easier time setting this up with the library available, rather than creating the logic from scratch.

Financial Applications
The Paging library can be leveraged in financial applications when an enormous amount of data is displayed. For example, a user's account transactions, statements, and credit card history can be a long list. Typically, an application requires the user to limit a transaction inquiry to a given period of time. A PagedList can help load and display the results page by page. Even when the user does not select a transaction period, an application can continuously fetch transactions as the user scrolls to the next or previous pages, using an appropriate DataSource.