Software Heritage API Client
This is a PHP API client/connector for the Software Heritage (SWH) web API, currently in beta. The client is built around the Illuminate Http package
and the Guzzle HTTP library.
[!Note] Detailed documentation can be found in the wiki pages of this very repository.
A demo version (covering a subset of the features) is available here: Demo Version
Requests for new features and fixes will be gladly considered. Please feel free to report issues.
1) Clone this project.
2) Open a console session and navigate to the cloned directory:
Run "composer install"
This should also install the PHP REPL, PsySH.
3) (Optional) Acquire SWH tokens for higher SWH API rate limits.
4) Prepare .env file and add tokens:
4.1) Rename/Copy the cloned ".env.example" file to .env
cp .env.example .env
4.2) (Optional) Edit these two token keys:
SWH_TOKEN_PROD=Your_TOKEN_FROM_SWH_ACCOUNT # step 3)
SWH_TOKEN_STAGING=Your_STAGING_TOKEN_FROM_SWH_ACCOUNT # step 3)
5) (Optional) Add psysh to PATH.
In a console session inside the cloned directory, start the PHP REPL:
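To illustrate what the optional tokens from step 4.2 are for: the SWH web API accepts a token as a bearer token in the Authorization header. The following is a minimal sketch under the assumption that the token has been exported into the process environment; `swhAuthHeaders` is a hypothetical helper written for this illustration and is not part of this client, which may load and send the token differently.

```php
<?php
declare(strict_types=1);

/**
 * Build request headers for the SWH API.
 * Hypothetical helper for illustration only; not part of this client.
 * Assumes the token is available as an environment variable.
 */
function swhAuthHeaders(string $envKey = 'SWH_TOKEN_PROD'): array
{
    $headers = ['Accept' => 'application/json'];
    $token = getenv($envKey); // string, or false when the variable is unset

    // Without a token the request is anonymous and gets the lower rate limit.
    if (is_string($token) && $token !== '') {
        $headers['Authorization'] = 'Bearer ' . $token;
    }

    return $headers;
}
```

Leaving the token unset simply omits the Authorization header, which matches the "optional" nature of steps 3 and 4.2.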
$ psysh // if not added to PATH replace with: vendor/bin/psysh
Psy Shell v0.12.0 (PHP 8.2.0 — cli) by Justin Hileman
This opens a console-based REPL session in which you can test the API classes and their methods before building a suitable workflow or use case.
As a one-time configuration parameter, you can set the data type in which SWH responses are returned (default: JSON):
> namespace Module\HTTPConnector;
> use Module\HTTPConnector;
> HTTPClient::setOptions(responseType:'object') // json/collect/object available
- More details on the default configs: Default Configurations
- More details on further preset options: Preset Configurations.
Retrieve the latest full visit of an origin in the SWH archive:
> namespace Module\OriginVisits;
> use Module\OriginVisits;
> $visitObject = new SwhVisits('https://github.com/torvalds/linux/');
> $visitObject->getVisit('latest', requireSnapshot: true)
More details on further SWH visits methods: SwhVisits.
Treat archive objects as graph nodes: retrieve a node's contents or edges, or find a path to other nodes (top-down):
> namespace Module\DAGModel;
> use Module\DAGModel;
> $snpNode = new GraphNode('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e58')
> $snpNode->nodeHopp() // node contents
> $snpNode->nodeEdges() // node edges keyed by the respective name
> $revNode = new GraphNode('swh:1:rev:9cf5bf02b583b93aa0d149cac1aa06ee4a4f655c')
> $revNode->nodeTraversal('deps/nghttp2/lib/includes/nghttp2/nghttp2ver.h.in') // traverse to a deeply nested file
More details on:
- General Node Methods.
- The Graph methods.
You can specify a repository URL, with or without a path, and archive it to SWH using either of two variants (static or non-static method):
> namespace Module\Archival;
> use Module\Archival;
> $saveRequest = new Archive('https://github.com/torvalds/linux/') // Example 1
> $saveRequest->save2Swh()
> $newSaveRequest = Archive::repository('https://github.com/hylang/hy/tree/stable/hy/core') // Example 2
// in both cases: the returned POST response contains the save request id and date
Enquire about the archival status using the id or date of the archival request (available in the initial POST response):
> $saveRequest->getArchivalStatus($saveRequestDateOrID) // current status is returned
> $saveRequest->trackArchivalStatus($saveRequestDateOrID) // tracks until archival has succeeded
More details on further archive methods: Archive.
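Tracking a save request until it completes amounts to polling its status at intervals. The sketch below shows that idea in isolation; `pollUntilDone` is a hypothetical helper and not how `trackArchivalStatus` is necessarily implemented, and the terminal status values `succeeded` and `failed` are assumptions about what the status source reports.

```php
<?php
declare(strict_types=1);

/**
 * Poll a status source until it reports a terminal state.
 * Hypothetical helper for illustration only; not part of this client.
 *
 * @param callable(): string $fetchStatus returns the current status string
 * @param string[]           $terminal    statuses that stop the polling (assumed values)
 */
function pollUntilDone(
    callable $fetchStatus,
    array $terminal = ['succeeded', 'failed'],
    int $maxAttempts = 10,
    int $delaySeconds = 5
): string {
    $status = '';
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $status = $fetchStatus();
        if (in_array($status, $terminal, true)) {
            return $status;
        }
        // Wait before the next attempt, but not after the final one.
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds);
        }
    }

    return $status; // last status seen; still non-terminal
}
```

The callable would wrap whatever returns the current status, for example a call to `getArchivalStatus`; how to extract the status string from that response depends on the configured response type and is not shown here. Capping the number of attempts keeps the loop from running indefinitely if a request never reaches a terminal state.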
Validate a given SWHID. A TypeError is thrown for invalid SWHIDs.
> namespace Module\DataType;
> use Module\DataType;
> $snpID = new SwhcoreId('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e5Z') // throws TypeError (trailing 'Z' is not a hex digit)
Full details on SWHID persistent identifiers: Syntax
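For reference, a core SWHID has the form `swh:1:<type>:<hash>`, where the type is one of `snp`, `rel`, `rev`, `dir` or `cnt` and the hash is 40 lowercase hexadecimal digits. The following is a minimal, self-contained sketch of that syntax check; `isCoreSwhid` is a hypothetical function written for this illustration and is not the `SwhcoreId` implementation (it returns a boolean rather than throwing, and it does not handle qualifiers).

```php
<?php
declare(strict_types=1);

/**
 * Check whether a string is a syntactically valid core SWHID (no qualifiers).
 * Hypothetical helper for illustration only; not the SwhcoreId implementation.
 */
function isCoreSwhid(string $id): bool
{
    // swh : schema version 1 : object type : 40 lowercase hex digits
    return preg_match('/\Aswh:1:(snp|rel|rev|dir|cnt):[0-9a-f]{40}\z/', $id) === 1;
}

var_dump(isCoreSwhid('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e58')); // bool(true)
var_dump(isCoreSwhid('swh:1:snp:bcfd516ef0e188d20056c77b8577577ac3ca6e5Z')); // bool(false)
```

The check is purely syntactic: a well-formed identifier may still refer to an object that is not present in the archive.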
[!Note] Todo: Core identifiers with qualifiers.
Retrieve the list of metadata authorities that provided metadata on the given target:
> namespace Module\MetaData;
> use Module\MetaData;
> SwhMetaData::getOriginMetaData('https://github.com/torvalds/linux/')
More details on further metadata methods: Metadata.