List all pages in a space showing titles of page and IDs

Vikas Shrivastava
I'm New Here
July 29, 2018

Can anyone please help me get a list of all pages, showing page titles and page IDs, within a Confluence space using a bash / Python script?

I want to generate a list of all pages showing the page title and page ID.

Thanks in advance

Vikas

2 answers

2 votes
Zak Laughton
Atlassian Team
July 31, 2018

Hi Vikas!

There are a few ways to accomplish this:

REST API

The REST API will likely be the best way to retrieve data from Confluence in a bash or Python script. See Confluence REST API Examples for terminal and Python commands that use the API.

The following URL will return a JSON list of all pages in the instance (replace <base-URL> with the base URL for your instance):

http://<base-URL>/rest/api/content?type=page&start=0&limit=99999

You can then use Python to parse the JSON and find the ID and title of each page (a useful article on JSON parsing with Python: Working with JSON data in Python).
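As a minimal sketch of that parsing step: the function below assumes the response body has a top-level "results" array whose entries carry "id" and "title" fields. The function name and the sample payload are hypothetical and only illustrate the shape.

```python
import json


def extract_ids_and_titles(payload: str) -> list[tuple[str, str]]:
    """Return (id, title) pairs from a content-listing JSON response."""
    data = json.loads(payload)
    # Page entries are expected under the top-level "results" array.
    return [(page["id"], page["title"]) for page in data.get("results", [])]


if __name__ == "__main__":
    # Hypothetical, trimmed-down response body for illustration only.
    sample = (
        '{"results": [{"id": "123", "title": "Home"},'
        ' {"id": "456", "title": "FAQ"}], "size": 2}'
    )
    for page_id, title in extract_ids_and_titles(sample):
        print(f"{page_id}\t{title}")
```

A single response may hold only part of the pages, so this function would normally be applied to each response in turn.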

Database

While the REST API would be most convenient to use with a Python/bash script, you can also get all the page titles and IDs from the database with the following query:

SELECT title, contentid
FROM content
WHERE contenttype = 'PAGE'
AND prevver IS NULL
AND content_status = 'current';

I hope this helps!
-Zak

0 votes
antony terrence
Contributor
June 1, 2021

@Zak Laughton When I use the following, I get only 200 results. Is that set by the Confluence Server admin?

http://<base-URL>/rest/api/content?type=page&start=0&limit=99999
Zhiwei Deng
June 1, 2021

This URL works for me:

https://<base-URL>/wiki/rest/api/space/{SPACE_KEY}/content?start=0&limit=9999&type=page

But there are still some problems:

1. The results are still capped; the limit is 1000.

2. When I add the parameter expand=children.page, the limit parameter has no effect (in fact, the limit drops back to 200).

@Vikas Shrivastava @antony terrence 

antony terrence
Contributor
June 1, 2021 edited

I had to fetch the first set of results and then loop for as long as the response contains a next link. When I set the limit to 99999, I get a maximum of 500 results. So even for a simple task like getting the details of all pages, we have to make multiple calls. I am sure there are areas where Atlassian could reduce the number of calls required, and this scenario is one of them. The depth parameter does not work.
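A rough Python sketch of that loop, using only the standard library. The helper names (http_get_json, fetch_all_results) are hypothetical. It assumes each response carries "results" plus a "_links" object whose "next" entry is a relative path, and that "_links.base", when present, is the prefix to join it with; otherwise the path is resolved against the URL just requested.

```python
import json
import urllib.request
from urllib.parse import urljoin


def http_get_json(url, headers=None):
    """Fetch a URL and decode the JSON body (standard library only)."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_results(first_url, get_json=http_get_json):
    """Collect "results" from every response, following "_links.next"."""
    results = []
    url = first_url
    while url:
        body = get_json(url)
        results.extend(body.get("results", []))
        links = body.get("_links", {})
        next_link = links.get("next")
        if not next_link:
            break  # no next link means this was the last batch
        # Assumption: "next" is relative; prefer "_links.base" as the prefix.
        base = links.get("base")
        url = base + next_link if base else urljoin(url, next_link)
    return results
```

The get_json parameter is there so the loop can be exercised with canned responses, or swapped for a function that adds authentication headers.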

Zhiwei Deng
June 3, 2021

Yes. In the end I made multiple calls to get all pages. But I found another problem: there is also a limit on the "children" field when I add the parameter expand=children.page.

(The limit is 25.) So I can't generate the tree structure. This is confusing.

https://<base-URL>/wiki/rest/api/space/{SPACE_KEY}/content?expand=children.page&type=page&limit=9999
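One possible way around the capped "children" field, offered as an untested alternative rather than something confirmed in this thread, is to request expand=ancestors instead and rebuild the tree from the flat page list. The sketch below assumes each page dict has "id", "title", and an "ancestors" list ordered from the top of the space down to the direct parent. Both function names are hypothetical.

```python
from collections import defaultdict


def build_children_map(pages):
    """Map each parent page ID to the IDs of its direct children.

    Top-level pages (no ancestors) are stored under the key None.
    """
    children = defaultdict(list)
    for page in pages:
        ancestors = page.get("ancestors") or []
        # Assumption: the last ancestor is the direct parent.
        parent_id = ancestors[-1]["id"] if ancestors else None
        children[parent_id].append(page["id"])
    return dict(children)


def render_tree(pages):
    """Return indented "title (id)" lines, one per page reachable from the top."""
    titles = {page["id"]: page["title"] for page in pages}
    children = build_children_map(pages)
    lines = []

    def walk(parent_id, depth):
        for child_id in children.get(parent_id, []):
            lines.append("  " * depth + f"{titles[child_id]} ({child_id})")
            walk(child_id, depth + 1)

    walk(None, 0)
    return lines
```

Pages whose parent is missing from the input list are not reachable from the top and are left out of the rendered lines, so it is worth comparing the line count with the page count.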
Pankaj Rana May 23, 2023

I get a 404 Not Found when using the API call above.

Admin October 2, 2023

Hi guys, I used this script to list all pages from a specific space via the API:

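# Fetch the first batch of pages for the space
$url = "https://$($serverUrl)/rest/api/space/$($SpaceKey)/content/page?limit=99999"
$response = Invoke-RestMethod -Method "GET" -Headers $headers -Uri $url -UseBasicParsing
$allSpacePages = @($response.results)

# Follow "_links.next" until the response no longer contains one.
# Checking before each request avoids a call with an empty link
# when the first batch is already the last.
while ($null -ne $response._links.next) {
    $url = "https://$($serverUrl)$($response._links.next)"
    $response = Invoke-RestMethod -Method "GET" -Headers $headers -Uri $url -UseBasicParsing
    $allSpacePages += $response.results
}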

 



This really does walk through every "_links.next" (I tested it step by step in Postman) until that object is null, and it returns about 6,500 pages from the space. But when I listed all pages from the space via an SQL query:

SELECT * FROM [Cfl-Db].[dbo].[CONTENT]
WHERE CONTENTTYPE = 'PAGE' AND SPACEID = 51118093
ORDER BY TITLE

I got about twice as many pages, roughly 12,000!

So the question is: why didn't the API call list all existing pages?

I use Data Center version 7.18.3.


Thank you for your answers :)



Admin October 3, 2023


Guys, my fault :( I realized that the DB query returns pages of every status: drafts, deleted, and current.
But even so, I got more pages from the DB than from the API.

Fixed query:

SELECT * FROM [$db].[dbo].[CONTENT]
WHERE CONTENTTYPE = 'PAGE' AND CONTENT_STATUS = 'current' AND SPACEID = $spaceId
ORDER BY TITLE
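To see which rows account for the gap, the two ID lists can be compared directly. The sketch below is a hypothetical helper that takes the page IDs collected from the API and the CONTENTID values from the query above and reports what appears on only one side. Note that the query in the earlier answer also filters on prevver IS NULL, which this one does not; whether that explains the difference is not confirmed in this thread.

```python
def compare_page_ids(api_ids, db_ids):
    """Return (only_in_db, only_in_api) as sorted lists of ID strings."""
    # IDs are normalised to strings: the API returns them as strings,
    # while a database driver typically returns numbers.
    api_set = {str(page_id) for page_id in api_ids}
    db_set = {str(page_id) for page_id in db_ids}
    return sorted(db_set - api_set), sorted(api_set - db_set)


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    only_in_db, only_in_api = compare_page_ids(["101", "102"], [101, 102, 103])
    print("Only in DB:", only_in_db)
    print("Only in API:", only_in_api)
```

Looking up a few of the database-only IDs in the CONTENT table should show what those extra rows have in common.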
