
Pagination in a REST web application

This is a more generic reformulation of this question (with the elimination of the Rails specific parts)

I am not sure how to implement pagination on a resource in a RESTful web application. Assuming that I have a resource called products, which of the following do you think is the best approach, and why:

1. Using only query strings

eg. http://application/products?page=2&sort_by=date&sort_how=asc
The problem here is that I can't use full page caching and also the URL is not very clean and easy to remember.

2. Using pages as resources and query strings for sorting

eg. http://application/products/page/2?sort_by=date&sort_how=asc
In this case, the problem that I see is that http://application/products/pages/1 is not a unique resource, since using sort_by=price can yield a totally different result, and I still can't use page caching.

3. Using pages as resources and a URL segment for sorting

eg. http://application/products/by-date/page/2
I personally see no problem in using this method, but someone warned me that this is not a good way to go (he didn't give a reason, so if you know why it's not recommended, please let me know)

Any suggestions, opinions, critiques are more than welcome. Thanks.

This is a great question.
Bonus question: how do people usually specify page sizes?
Don't forget about Matrix parameters w3.org/DesignIssues/MatrixURIs.html

Vinod

I agree with Fionn, but I'll go one step further and say that to me the page is not a resource; it's a property of the request. That makes me choose option 1, query string only. It just feels right. I really like how the Twitter API is structured RESTfully: not too simple, not too complicated, well documented. For better or worse, it's my "go to" design when I am on the fence about doing something one way versus another.


+1: query strings are not first-class resource identifiers; they are just clarification for ordering and grouping of the resource.
@S.Lott The request is the resource. What you call "first-class resources" are defined as values by Fielding in section 5.2.1.1 of his dissertation. Furthermore, in the same section, Fielding gives the Latest Revision of a source code file as an example of a resource. How can that be a resource but The latest 10 products be "properties of the request on the products resource"? I understand that your view is more practical, but I think that it is less RESTful.
Note that my comment does not mean that I disagree with the choice of using query strings over URLs: both are viable solutions as long as the API is hypermedia-driven, as @RichApodaca has mentioned in his answer. I am just pointing out that the Page should be considered as a resource from a REST point of view.
Ben

I think the problem with version 3 is more a "point of view" problem - do you see the page as the resource or the products on the page.

If you see the page as the resource it is a perfectly fine solution, since the query for page 2 will always yield page 2.

But if you see the products on the page as the resource you have the problem that the products on page 2 might change (old products deleted, or whatever), in this case the URI is not always returning the same resource(s).

E.g. A customer stores a link to the product list page X, next time the link is opened the product in question might no longer be on page X.


Well but if you delete something there shouldn't be something else on the same URI. If you delete all products of page X - page X may still be valid but contains now the products from page X + 1. So the URI for page X has become the URI for page X + 1 if you see it in "product resource view".
> If you see the page as the resource it is a perfectly fine solution, since the query for page 2 will always yield page 2. Does it even make sense? The same URL (any URL mentioning page 2) will always yield page 2 no matter what you treat as the resource.
Seeing page as resource probably should introduce POST /foo/page to create a new page, right?
Your answer smoothly goes to "correct solution is 1", but doesn't state it.
In my mind, page is a floating concept, and not related to the underlying domain. And therefore should not be considered as a resource. I mean floating in the sense that it is fluid, that the concept of page changes with the context; one user of your API may be a mobile app, that can consume only 2 products per page, while the other is a machine app that can consume the whole damn list. In short, page is a "representation" of the underlying domain entity (product) and should not be included as a part of the URL; only as a query parameter.
temoto

HTTP has a great Range header, which is suitable for pagination too. You may send

Range: pages=1

to get only the first page. That may force you to rethink what a page is. Maybe the client wants a different range of items. The Range header also works to declare an order:

Range: products-by-date=2009_03_27-

to get all products newer than that date or

Range: products-by-date=0-2009_11_30

to get all products older than that date. '0' is probably not the best solution, but the RFC seems to want something for the range start. There may be HTTP parsers deployed which wouldn't parse units=-range_end.

If headers are not an (acceptable) option, I reckon the first solution (all in the query string) is the way to deal with pages. But please, normalize query strings (sort the key=value pairs in alphabetical order). This solves the "?a=1&b=x" versus "?b=x&a=1" differentiation problem.
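The normalization suggested above can be sketched with Python's standard library. This is only an illustration: the function name normalize_url is made up here, and sorting by key and then by value is one possible canonical order among several.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url):
    """Return the URL with its query pairs sorted by key, then by value."""
    parts = urlsplit(url)
    # keep_blank_values=True so that "?flag=" is not silently dropped
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(sorted(pairs))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Two spellings of the same request collapse to one cache key.
a = normalize_url("http://application/products?sort_by=date&page=2")
b = normalize_url("http://application/products?page=2&sort_by=date")
print(a == b)
```

A cache or router that keys on the normalized URL then treats both spellings as the same request.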


Headers might look nice at first glance, but they disallow sharing the page (e.g. by copying the URL). So for Ajax requests they might be a nice solution (since pages modified by Ajax cannot be shared in their current state anyway), but I would not use them for regular pagination.
And the Range header is only for byte ranges. See [the HTTP headers spec](w3.org/Protocols/rfc2616/rfc2616-sec14.html), section 14.35.
@ChrisWestin w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.12 says: "HTTP/1.1 uses range units in the Range (section 14.35) and Content-Range (section 14.16) header fields. range-unit = bytes-unit | other-range-unit". Maybe you are referring to: "The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1 implementations MAY ignore ranges specified using other units." That is not the same as your statement.
@Markus I can't imagine a use case where you would be sharing a REST API resource :)
@JakubKnejzlik Sharing is not an issue, but using HTTP headers for paging prevents using HATEOAS links for paging.
Rich Apodaca

Option 1 seems the best, to the extent that your application views pagination as a technique for producing a different view of the same resource.

Having said that, the URL scheme is relatively insignificant. If you are designing your application to be hypertext-driven (as all REST applications must be by definition), then your client will not be constructing any URIs on its own. Instead, your application will be giving the links to the client and the client will follow them.

One kind of link your application can provide to the client is a pagination link.

The pleasant side-effect of all of this is that even if you change your mind about pagination URI structure and implement something totally different next week, your clients can continue working without any modification whatsoever.
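To make the hypertext-driven idea concrete, here is a minimal sketch in which the server embeds "prev" and "next" links in each page and the client only follows them. Everything in it is an assumption for illustration: the document shape ({"items": ..., "links": ...}), the names render_page, fake_get and fetch_all, and the in-memory PRODUCTS list standing in for a real HTTP service.

```python
from urllib.parse import parse_qs, urlsplit

PRODUCTS = ["a", "b", "c", "d", "e"]      # stand-in data set
BASE = "http://application/products"      # URL scheme is a server-side detail


def link(page):
    return f"{BASE}?page={page}"


def render_page(page, per_page=2):
    """Server side: return one page plus the links the client may follow."""
    total_pages = max(1, -(-len(PRODUCTS) // per_page))  # ceiling division
    start = (page - 1) * per_page
    links = {"self": link(page)}
    if page > 1:
        links["prev"] = link(page - 1)
    if page < total_pages:
        links["next"] = link(page + 1)
    return {"items": PRODUCTS[start:start + per_page], "links": links}


def fake_get(url):
    """Stand-in for an HTTP GET; a real client would call the network here."""
    page = int(parse_qs(urlsplit(url).query)["page"][0])
    return render_page(page)


def fetch_all(get, first_url):
    """Client side: never builds a URI, only follows the 'next' link."""
    items, url = [], first_url
    while url:
        doc = get(url)
        items.extend(doc["items"])
        url = doc["links"].get("next")
    return items


everything = fetch_all(fake_get, link(1))
print(everything)
```

Because fetch_all only reads links["next"], changing link() to emit a different URI structure (with fake_get adjusted to match) would not require any change on the client side.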


Nice reminder on using hypermedia like links in REST web services.
John Snyders

I have always used the style of option 1. Caching has not been a concern since the data changes frequently anyway in my case. If you allow the size of the page to be configurable then again the data can't be cached.

I don't find the url hard to remember or unclean. To me this is a fine use of query parameters. The resource is clearly a list of products and the query params are just telling how you want the list displayed - sorted and which page.


+1 I think you are right and I'll go with the query parameters (option 1)
"I don't find the URL hard to remember". This observation is useless in REST applications, as those should typically have only one single bookmark... If a user (or a client app) tries to "remember" the URL, this is a good sign that the API is not restful.
TEHEK

Strange that nobody has pointed out that Option 3 has parameters in a specific order. http://application/products/Date/Descending/Name/Ascending/page/2 and http://application/products/Name/Ascending/Date/Descending/page/2

are pointing to the same resource, but have completely different urls.

For me Option 1 seems the most acceptable, since it clearly separates "What I want" and "How I want it" (it even has a question mark between them lol). Full-page caching can be implemented using the full URL (all options will suffer from the same problem anyway).

With the parameters-in-URL approach, the only benefit is a clean URL. You still have to come up with some way to encode parameters and losslessly decode them. Of course you can go with URL encoding/decoding, but it will make the URLs ugly again :)


Those are two different orderings. The first sorts by date descending, and only breaks ties by name ascending; the second sorts by name ascending, and only breaks ties by date descending.
In fact, the two example URLs given here differ not only in spelling but also in meaning. Since they denote a path, there is no guarantee that you find the same thing when turning left first and right afterwards, or vice versa. Having said this, sort parameters as URL path parts have formal advantages over URL parameters, which should be commutatively exchangeable without changing the overall meaning, but they do indeed suffer from encoding traps, as is said here.
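A small sketch of why the two orderings in the comments above are different requests. The rows are invented sample data, and the multi-key sort is done with two stable passes (least significant key first), which is one common way to do it in Python.

```python
from operator import itemgetter

# (date, name) pairs; invented sample data
rows = [("2009-01-01", "a"), ("2009-01-02", "b")]

# Date descending, ties broken by name ascending:
# sort by the tie-breaker first, then by the primary key (sorted() is stable).
date_desc_name_asc = sorted(sorted(rows, key=itemgetter(1)),
                            key=itemgetter(0), reverse=True)

# Name ascending, ties broken by date descending.
name_asc_date_desc = sorted(sorted(rows, key=itemgetter(0), reverse=True),
                            key=itemgetter(1))

print(date_desc_name_asc)
print(name_asc_date_desc)
```

The two results list the same rows in opposite order, so page 2 of one ordering is generally not page 2 of the other.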
Mario Arturo

Looking for best practices I came across this site:

http://www.restapitutorial.com

On the resources page there is a link to download a .pdf that contains the complete REST best practices suggested by the author. Among other things, it has a section about pagination.

The author suggests supporting both a Range header and query-string parameters.

Request

HTTP header example:

Range: items=0-24

Query-string parameters example:

GET http://api.example.com/resources?offset=0&limit=25

Where offset is the beginning item number and limit is the maximum number of items to return.

Response

The response should include a Content-Range header indicating how many items are being returned and how many total items exist yet to be retrieved

HTTP header examples:

Content-Range: items 0-24/66

Content-Range: items 40-65/*

In the .pdf there are some other suggestions for more specific cases.
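As a rough sketch of the header variant described above, the following parses a Range value of the form items=first-last and builds the matching Content-Range value. Note that items is a custom range unit in that guide (the comments earlier in this thread point out that HTTP/1.1 itself defines only bytes), and the function names and the ValueError used for unsatisfiable ranges are assumptions made for this example.

```python
import re
from typing import List, Tuple


def parse_items_range(header: str) -> Tuple[int, int]:
    """Parse 'items=0-24' into inclusive (first, last) indexes."""
    match = re.fullmatch(r"items=(\d+)-(\d+)", header.strip())
    if not match:
        raise ValueError(f"unsupported Range value: {header!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError("range end precedes range start")
    return first, last


def slice_with_content_range(items: List, header: str) -> Tuple[List, str]:
    """Return the requested slice and the Content-Range value describing it."""
    first, last = parse_items_range(header)
    if first >= len(items):
        # A real server would probably answer 416 here; assumption for the sketch.
        raise ValueError("range not satisfiable")
    last = min(last, len(items) - 1)  # clamp to the end of the collection
    return items[first:last + 1], f"items {first}-{last}/{len(items)}"


resources = list(range(66))
page, content_range = slice_with_content_range(resources, "items=0-24")
print(len(page), content_range)
```

The query-string form (offset=0&limit=25) maps onto the same slice: first is offset and last is offset + limit - 1.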


Sorter

I'd prefer using query parameters offset and limit.

offset: for index of the item in the collection.

limit: for count of items.

The client can simply keep updating the offset as follows

offset = offset + limit

for the next page.

The path is considered the resource identifier. And a page is not a resource but a subset of the resource collection. Since pagination is generally a GET request, query parameters are best suited for pagination rather than headers.

Reference: https://metamug.com/article/rest-api-developers-dilemma.html#Requesting-the-next-page
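The offset = offset + limit loop described above can be sketched as follows. CATALOG and get_products are placeholders for a real call such as GET /products?offset=...&limit=...; stopping on the first empty page is an assumption about how the server signals the end of the collection.

```python
CATALOG = list(range(7))  # stand-in for the server-side collection


def get_products(offset, limit):
    """Placeholder for GET /products?offset=<offset>&limit=<limit>."""
    return CATALOG[offset:offset + limit]


def iter_all(fetch, limit):
    """Yield every item by advancing the offset one page at a time."""
    offset = 0
    while True:
        page = fetch(offset, limit)
        if not page:          # an empty page means there is nothing left
            break
        yield from page
        offset = offset + limit


all_items = list(iter_all(get_products, limit=3))
print(all_items)
```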


Steve Willcock

I'm currently using a scheme similar to this in my ASP.NET MVC apps:

e.g. http://application/products/by-date/page/2

Specifically, it's: http://application/products/Date/Ascending/3

However, I'm not really happy with including paging and sorting information in the route in this way.

The list of items (products in this case) is mutable. i.e. the next time someone returns to a url that includes paging and sorting parameters, the results they get may have changed. So the idea of http://application/products/Date/Ascending/3 as a unique url that points to a defined, unchanging set of products is lost.


The first issue, with sorting on multiple columns, applies to all 3 methods in my opinion. So it isn't really a pro/con for any of them. Regarding the second issue: can't that happen to any resource? A product, for example, can also be edited/deleted.
I think sorting on multiple columns is really a 'con' for all 3 methods as the url just gets bigger and more unmanageable - hence one reason I am considering moving to form based page / sort parameters. For the second issue, I think there's a fundamental conceptual difference between a unique persistent identifier like a product id and a transient list of products. For deleted products a message e.g. 'That product does not exist in the system' tells you something concrete about that product.
Removing all the paging and sorting information from the route is good. And pushing it into POST parameters is bad. Hello? Question is about REST. We're not using POST just to make URL shorter in REST. Verb makes sense.
Personally, I wouldn't use form parameters for a query because it would almost require a POST or PUT HTTP method (since there is a body now in the request). GET seems to me like the more appropriate method to use since both POST and PUT imply modifying the resource. Due to that I would go with adding more query parameters to the URL when sorting by multiple columns is needed.
insane.dreamer

I tend to agree with slf that "page" is not really a resource. On the other hand, option 3 is cleaner, easier to read, and can be more easily guessed by the user and even typed out if necessary. I'm torn between options 1 and 3, but don't see any reason not to use option 3.

Also, while they look nice, one downside of using hidden parameters, as someone mentioned, rather than query strings or URL segments is that the user can't bookmark or directly link to a particular page. That may or may not be an issue depending on the application, but just something to be aware of.


Regarding your mention of being easier to guess, this should not matter. If building a hypermedia API, the users should never HAVE to guess URIs.
Alex

I've used solution 3 before (I write a LOT of Django apps), and I don't think that there is anything wrong with it. It's just as generatable as the other two (in case you need to do some mass scraping or the like) and it looks cleaner. Plus, your users can guess URLs (if it's a public-facing app), and people like being able to go directly where they want, and URL-guessing feels empowering.


Eugene

I use the following URLs in my projects:

http://application/products?page=2&sort=+field1-field2

which means: "give me the second page, ordered ascending by field1 and then descending by field2". Or, if I need even more flexibility, I use:

http://application/products?skip=20&limit=20&sort=+field1-field2
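A sort value like +field1-field2 needs a small parser on the server. The sketch below is one possible reading of that convention, not the answerer's actual code: parse_sort and apply_sort are made-up names, field names are assumed to be plain identifiers, and a leading space is accepted as "ascending" because standard query-string decoding (for example urllib.parse.parse_qs) turns an unencoded + into a space.

```python
import re
from operator import itemgetter
from urllib.parse import parse_qs

SORT_SPEC = re.compile(r"(?:[+\- ][A-Za-z_]\w*)+")
SORT_TERM = re.compile(r"([+\- ])([A-Za-z_]\w*)")


def parse_sort(spec):
    """'+field1-field2' -> [('field1', False), ('field2', True)] (name, descending)."""
    if not SORT_SPEC.fullmatch(spec):
        raise ValueError(f"bad sort spec: {spec!r}")
    return [(name, sign == "-") for sign, name in SORT_TERM.findall(spec)]


def apply_sort(rows, keys):
    """Sort dict rows by several keys using stable passes, least significant first."""
    for name, descending in reversed(keys):
        rows = sorted(rows, key=itemgetter(name), reverse=descending)
    return rows


# parse_qs decodes the '+' as a space, which parse_sort treats as ascending.
raw = parse_qs("page=2&sort=+field1-field2")["sort"][0]
keys = parse_sort(raw)

rows = [
    {"field1": 1, "field2": "a"},
    {"field1": 1, "field2": "b"},
    {"field1": 0, "field2": "a"},
]
ordered = apply_sort(rows, keys)
print(keys)
print(ordered)
```

Clients that percent-encode the plus sign as %2B avoid the space ambiguity altogether.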

Susanta Ghosh

I use the following pattern to get the next page of records: http://application/products?lastRecordKey=?&pageSize=20&sort=ASC

RecordKey is a column of the table that holds a sequential value in the DB. It is used to fetch only one page of data at a time from the DB. pageSize determines how many records to fetch. sort orders the records in ascending or descending order.
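This pattern is often called keyset (or "seek") pagination. A minimal sketch with the standard-library sqlite3 module follows; the products table, its id column playing the role of RecordKey, and the function name fetch_page are all assumptions for illustration, and passing None for the first page is just one way to handle the missing key.

```python
import sqlite3


def fetch_page(conn, last_record_key, page_size, sort="ASC"):
    """Return the page that follows last_record_key in the given sort direction."""
    # Only these two fixed strings are ever interpolated into the SQL text.
    op, order = (">", "ASC") if sort.upper() == "ASC" else ("<", "DESC")
    if last_record_key is None:  # first page: no key to seek from yet
        sql = f"SELECT id, name FROM products ORDER BY id {order} LIMIT ?"
        params = (page_size,)
    else:
        sql = (f"SELECT id, name FROM products WHERE id {op} ? "
               f"ORDER BY id {order} LIMIT ?")
        params = (last_record_key, page_size)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products (name) VALUES (?)",
                 [(f"p{i}",) for i in range(1, 6)])

page1 = fetch_page(conn, None, 2)
page2 = fetch_page(conn, page1[-1][0], 2)  # seek past the last key of page 1
print(page1, page2)
```

Unlike offset-based paging, the query seeks directly to the key, so rows inserted or deleted before the current position do not shift the next page.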