HackTheHearst API Detail
This page includes information about the API to the Phoebe A. Hearst Museum of Anthropology's digital collections data. Use of the API implies agreement with its Terms of Service and Disclaimer, given on HackTheHearst's main API page (parent to this page).
The API is backed by an Apache Solr instance; the full Solr API is not described on this page. For example, faceted search is not described here, but is described in the Solr documentation. To learn about ways to query the Solr API not covered on this page, consult the Apache Solr Reference Guide (also available as a PDF).
Note that authorization credentials (the app_id and app_key values) must be submitted with every request in order to access the API; the authentication section below shows how to pass them. App IDs and Keys will be distributed to teams at the HackTheHearst kickoff. Interactive documentation (a web page in which parameters to the API may be entered, and which displays the response documents returned for a request with those parameters) will be linked from this page by the time the HackTheHearst kickoff begins.
- The Hearst Museum digital collections data API is backed by an instance of Apache Solr. The Apache Solr Reference Guide (v4.8) is available in PDF form from multiple mirror sites. The section on Specifying Terms for the Standard Query Parser may be of particular interest to Hackathon participants, as it gives a much more complete picture of how queries in requests to this API can be formulated.
Fields retrievable via the Hearst Museum Solr API
The fields given in the list below are metadata to items in the Hearst Museum digital collections data. The primary objects they reference are physical objects in the Hearst Museum's collections. Images of, or associated with, these primary objects may be accessed via the URLs given in the blob_ss field.
The first table shown is a quick overview of the fields, and the second table covers the same fields in greater detail.
|Field Name||Brief descriptions|
|csid_s||Primary key for object|
|objmusno_s||The number assigned to a specific cataloged object|
|objsortnum_s||A sortable version of above|
|objaltnum_ss||Other numbers that were previously assigned to the object|
|objtype_s||The general type of object (e.g., archaeology versus ethnography)|
|objname_s||A name for the cataloged object|
|objcount_s||The piece count for the cataloged object|
|objcountnote_s||Notes regarding the piece count|
|objtitle_s||A title for the cataloged object|
|objdescr_s||A description of the cataloged object|
|objdept_s||The primary "catalog" to which the cataloged object belongs|
|objfcp_s||The field collection place of the object|
|objfcpverbatim_s||A verbatim statement of provenience|
|objfcpgeoloc_s||The latitude and longitude of the field collection place|
|objfcpelevation_s||The elevation of the field collection place|
|objfcptree_ss||The higher-order geographic place names for the field collection place|
|objassoccult_s||The cultural group(s) associated with the object in its original context|
|objculturetree_ss||The higher-order cultural group names for the associated cultural group(s)|
|objkeelingser_s||Used for audio recordings only: The Keeling series number|
|objinventory_s||The name of the relevant NAGPRA inventory|
|objfilecode_ss||An indication of the original general use of the cataloged object|
|objcontextuse_s||The context in which the object was originally used|
|objdimensions_ss||Measurements of the cataloged object|
|objmaterials_ss||Materials from which the cataloged object was made|
|objinscrtext_ss||Textual inscriptions found on the cataloged object|
|objcomment_s||Additional, relevant information about an object|
|objcollector_ss||The name(s) of the person(s) who collected the cataloged object|
|objcolldate_s||The date that the object was collected from the field (as text)|
|objcolldate_begin_dt||The earliest possible collection date (as ISO-8601 date)|
|objcolldate_end_dt||The latest possible collection date (as ISO-8601 date)|
|objproddate_s||The date that the object was produced (as text)|
|objproddate_begin_dt||The earliest possible production date (as ISO-8601 date)|
|objproddate_end_dt||The latest possible production date (as ISO-8601 date)|
|objacqdate_ss||The date that the object was acquired by the Museum (as text)|
|objacqdate_begin_dt||The earliest possible acquisition date (as ISO-8601 date)|
|objacqdate_end_dt||The latest possible acquisition date (as ISO-8601 date)|
|objaccdate_ss||The date that the object was accessioned into the Museum (as text)|
|objaccdate_begin_dt||The earliest possible accession date (as ISO-8601 date)|
|objaccdate_end_dt||The latest possible accession date (as ISO-8601 date)|
|objaccno_ss||The accession number(s) for the cataloged object|
|blob_ss||The GUID identifier of the associated media objects|
The following table provides additional descriptive detail for each of these fields:
|Field Name||Fuller descriptions|
|csid_s||A unique internal identifier for the catalog record. The object csid is in the form of a GUID.
|objmusno_s||The identifying number of the catalog record, called a Museum Number.
Examples: "11-37781", "K-3720a,b" or "2013.01.0001"
Note: Until 2012, Museum numbers were created and assigned following this pattern:
[catalog designation]-[object number][optional suffix]
For example, "18-1339a,b" or "18-319g"
The catalog designations are generally 1- or 2-digit numbers or a single letter.
The object numbers are sequentially assigned integers
The suffixes are optional and are used to specify sub-objects within a catalog record.
Since 2012, a more conventional system of numbering has been adopted, following this pattern:
[4-digit year].[2-digit accession number].[4-digit object number]
The accession number resets to 01 with each new year.
The object number resets to 0001 with each new accession.
|objsortnum_s||A formatted string that, when used for sorting, returns objects in their correct order (i.e., "2-30" comes after "2-4" and before "2-1000"). Used for sorting only.
Examples: "000011 022681 aj", "000008 000511 m"
|objaltnum_ss||An alternate number for an object. This field can have multiple values. Each value is a concatenation of the alternate number, the type of alternate number (in parentheses), and an optional note about the alternate number (which is also set inside the parentheses, when present).
Example: "1 (song number)"
|objtype_s||A short controlled string indicating the general type of object represented by the catalog record.
Examples: "(not specified)", "none (Registration)", "archaeology", "ethnography", "documentation", "sample", "indeterminate", "unknown".
|objname_s||An uncontrolled string that provides a short name for the object.
Example: "model canoe"
|objcount_s||The number of objects or pieces that comprise the catalog record. For instance, a quiver with four arrows might be cataloged together as a single catalog record and given a single Museum Number, but the object count will be set to 5 (1 quiver + 4 arrows).|
|objcountnote_s||An uncontrolled text field which can contain additional information about the count.
Examples: "weighed but not counted", "13 bags of fragments"
|objtitle_s||A concatenation of the title assigned to an object and the type of title it is.
Examples: "Petroglyph, People (Title Subject)", "Tallac House, Lake Tahoe. :1009 (Artist's Label)"
|objdescr_s||A fuller, but still terse, description of the object. The description may include information from other fields, such as materials, dimensions, and dates.
Example: "Medal; AE (copper, brass, or bronze) and gold; obverse: Eagle with Snake in beak, cactus; surrounded by wreath, flags, symbols of Spanish-Mexico; reverse: scene of place of battle, cannon and flags; surrounded by inscription; 1829; 1 1/2 x 1 15/16 inch oval. Recatalogued to 3-16176."
|objdept_s||A controlled field which contains a brief description of the primary curatorial area of an object. Now obsolete, but still helpful in many cases; for example, when an object has poor metadata.
Examples: "Cat. 11 - Oceania (incl. Australia)", "Cat. 6 - Ancient Egypt (the Hearst Reisner Egyptian Collection)"
|objfcp_s||The field collection place of the object. For archaeological objects, this is usually the archaeological site; for ethnographic objects, this is usually the place where the object was purchased or otherwise obtained.
Examples: "Santa Rosa, Sonoma county, California", "San Pedro de los Conchos, Chihuahua state, Mexico"
|objfcpverbatim_s||The verbatim text entered in the "provenience" field on the original paper catalog cards. This field often contains outdated place names (e.g., Northern Rhodesia), and will often contain information about the associated cultural group and/or the name of the maker.|
|objfcpgeoloc_s||The decimal latitude and longitude of the field collection place, presented in the form of latitude, longitude.
Example: "-8.11243825276631725, -79.0745471790682473"
|objfcpelevation_s||An expression of the elevation (in meters above sea level) of the field collection place. Note: this field is a text field, and can contain values in different formats.
Examples: "81", "1264 m", or "1000 m (avg)".
|objfcptree_ss||A field that contains the display names of the field collection place and all of the parent places of the field collection place, up to the level of continent. This field can have multiple values.
Example (all are values for the same object): "Oceania", "Micronesia, Oceania", "Caroline Islands, Micronesia, Oceania", "Federated States of Micronesia, Micronesia, Oceania", "Yap State, Federated States of Micronesia, Micronesia", "Yap Island, Yap State, Federated States of Micronesia", "Weloy municipality, Yap Island, Yap State", "Damkil site, Weloy municipality, Yap Island"
|objassoccult_s||A controlled string expression that indicates the cultural group(s) associated with the object in its original context.
Examples: "Pomo", "Yurok"
|objculturetree_ss||A field that contains the display names of the associated cultural group and of all the parent culture groups of the associated cultural group. This field can have multiple values, and may contain values from multiple hierarchies if there are multiple associated cultures. An "@" symbol indicates that the associated term is a guide term.
Example (all are values for the same object): "@The Americas", "@North America", "@North American native cultures", "@California tribes", "@Northern California tribes", "@Northwestern California tribes", "Klamath River Tribes", "Yurok"
|objkeelingser_s||(Used for audio recordings only.) The number and name of the series of recordings to which this object belongs. These series were described by Richard Keeling in his 1990 book, A Guide to Early Field Recordings (1900–1949) at the Lowie Museum of Anthropology.|
|objinventory_s||A controlled field indicating the name of the submitted document ("inventory") on which the object was reported to the National Park Service to comply with NAGPRA (the Native American Graves Protection and Repatriation Act). The Museum has submitted 208 NAGPRA inventories, and the complete list can be viewed here.
Examples: "caSanFranciscoCounty1", "caChumash2", "caCco138"
|objfilecode_ss||An indication of the classification of ethnographic objects according to criteria of use and function. The values in this field begin with a code in the form of "#.#". The first digit of this code indicates the general classification of the object:
Example: "2.3 Special ornaments, garb, and finery worn to battle by warriors (excluding status insignia)".
|objcontextuse_s||The context in which the object was originally used.
Example: "worn as ornamentation in hair during jump dance."
|objdimensions_ss||A concatenation of up to four fields of information:
Examples: "height 10.5 centimeters", "duration 123 seconds"
|objmaterials_ss||A concatenation of up to three fields of information:
Examples: "Obsidian", "Cotton", "Ivory", "Ceramic|Shell"
|objinscrtext_ss||A description and transcription of all textual inscriptions placed on the object.
Example: "Obverse legend: IMP GORDIANVS PIVS FEL AVG|Reverse legend: P M T R P IIII COS III P P |Reverse exergue: SC"
|objcomment_s||Additional, relevant information about an object, but that is out of scope for inclusion in the object description.
Example: "Photo: 15-8656. Published: Ill. C.D. Forde, UCPAAE 28:4, Pl. 52. Remarks: \"originally included 1-27074a--5 gourds--They have been given a new # 1-234129.\"" (NB: the included quotation marks have been escaped in this example)
|objcollector_ss||The name(s) of the person(s) who collected this object, often an archaeologist or ethnographer. This field can have multiple values.
Examples: "Prof. Robert F. Heizer [1915-1979]", "Dr. Alfred Emerson"
|objcolldate_s||An irregular string indicating the date that the object was collected from the field. For archaeological objects, this is the date that the object was excavated or otherwise found, and for ethnographic objects, this is the date that the object was purchased, presented, or found.
Examples: "Summer, 1925", "August 7, 1978"
|objcolldate_begin_dt||An ISO-8601 representation of the earliest possible collection date. Together with objcolldate_end_dt, forms a date range.
|objcolldate_end_dt||An ISO-8601 representation of the latest possible collection date. Together with objcolldate_begin_dt, forms a date range.
|objproddate_s||An irregular string indicating the date that the object was produced, minted, or otherwise manufactured.
Examples: "125 BC", "1680–1720"
|objproddate_begin_dt||An ISO-8601 representation of the earliest possible production date. Together with objproddate_end_dt, forms a date range.
|objproddate_end_dt||An ISO-8601 representation of the latest possible production date. Together with objproddate_begin_dt, forms a date range.
|objacqdate_ss||An irregular string indicating the date that the object was acquired by the Museum.|
|objacqdate_begin_dt||An ISO-8601 representation of the earliest possible acquisition date. Together with objacqdate_end_dt, forms a date range.
|objacqdate_end_dt||An ISO-8601 representation of the latest possible acquisition date. Together with objacqdate_begin_dt, forms a date range.
|objaccdate_ss||An irregular string indicating the date that the object was formally accessioned into the Museum's collections.|
|objaccdate_begin_dt||An ISO-8601 representation of the earliest possible accession date. Together with objaccdate_end_dt, forms a date range.
|objaccdate_end_dt||An ISO-8601 representation of the latest possible accession date. Together with objaccdate_begin_dt, forms a date range.
|objaccno_ss||The accession number(s) for the cataloged object.
Examples: "Acc.2467", "Acc.634", "Acc.500BN"
|blob_ss||The GUID identifier of the media object (usually an image) in CollectionSpace; used in conjunction with other API parameters, it can be used to retrieve an image or one of its several derivatives. See http://wiki.collectionspace.org/display/collectionspace/Blob+Service+RESTful+API for details.|
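Several of the fields above are text with a documented shape, so they can be split into parts on the client side. The following is a minimal Python sketch, not part of the API: the function names and the regular expressions are assumptions built only from the patterns and examples given in the table. Museum Numbers that fall outside the two described patterns (the catalog designations are only "generally" 1- or 2-digit numbers or a single letter) return None here.

```python
import re

# Patterns follow the numbering rules described for objmusno_s above.
# Assumption: the suffix is whatever trails the object number (e.g. "a,b").
LEGACY = re.compile(r"^(\d{1,2}|[A-Za-z])-(\d+)(.*)$")
MODERN = re.compile(r"^(\d{4})\.(\d{2})\.(\d{4})$")


def parse_museum_number(value):
    """Split a Museum Number into its parts; return None if unrecognized."""
    m = MODERN.match(value)
    if m:
        year, accession, obj = m.groups()
        return {"style": "modern", "year": int(year),
                "accession": int(accession), "object": int(obj)}
    m = LEGACY.match(value)
    if m:
        catalog, obj, suffix = m.groups()
        return {"style": "legacy", "catalog": catalog,
                "object": int(obj), "suffix": suffix}
    return None


def parse_geoloc(value):
    """Turn the "latitude, longitude" text of objfcpgeoloc_s into floats."""
    lat, lon = (float(part) for part in value.split(","))
    return lat, lon


def parse_elevation(value):
    """Return the leading number of an objfcpelevation_s string, or None."""
    m = re.match(r"\s*(-?\d+(?:\.\d+)?)", value)
    return float(m.group(1)) if m else None
```

Because objfcpelevation_s is free text, parse_elevation only reads the leading number and ignores qualifiers such as "(avg)".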
Including Authentication Credentials to obtain API access
The API requires that authentication credentials be submitted with each API request (RESTful APIs are stateless, so there's no "logged in" session maintained between requests). The app_id and app_key values must be included in every request to the API, whatever client is used; a 403 Forbidden error will be returned if valid values are missing from the request.
Here's an example using curl. Note that the authentication parameters are passed as HTTP headers, not as query parameters:
curl -v -H "app_id:abcdefgh" -H "app_key:12345678901234567890123456789012" -X GET "https://apis.berkeley.edu/hearst_museum/select?q=objname_s:headdress&wt=json&indent=on"
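The same request can be prepared from code. The following is a minimal Python sketch using only the standard library; the function names build_request and fetch_json are hypothetical, and the credentials in the usage lines are the placeholder values from the curl example, not working keys. fetch_json assumes the request asks for wt=json.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://apis.berkeley.edu/hearst_museum/select"


def build_request(params, app_id, app_key):
    """Prepare a GET request with the credentials sent as HTTP headers."""
    url = BASE_URL + "?" + urlencode(params)
    headers = {"app_id": app_id, "app_key": app_key}
    return Request(url, headers=headers, method="GET")


def fetch_json(request):
    """Send the request and decode a JSON response (requires wt=json)."""
    try:
        with urlopen(request) as response:
            return json.load(response)
    except HTTPError as err:
        if err.code == 403:
            # The API answers 403 when valid credentials are missing.
            raise PermissionError("403 Forbidden: check app_id and app_key") from err
        raise


request = build_request(
    {"q": "objname_s:headdress", "wt": "json", "indent": "on"},
    "abcdefgh",
    "12345678901234567890123456789012",
)
# fetch_json(request) would send it; it is not called here because the
# placeholder credentials above are not valid.
```

urlencode percent-encodes the colon in the query (objname_s%3Aheaddress), which is equivalent to the literal colon in the curl URL once decoded.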
Basic Solr URL parameters: query and specify response format
- The API only responds to the HTTP verb GET: The Hearst Museum is giving access to a read-only collection of materials.
- The fields described above are metadata. One of these (blob_ss) gives the physical location of the associated photos. Thus, for example, identifying an image of interest and obtaining it requires multiple requests against the API (and because multiple URLs may be included in the blob_ss field, parsing of that field's contents for the desired image among multiple options may also be required).
- When specifying field names, be aware that they are case-sensitive (query terms, however, are case-insensitive).
The q parameter holds the query statement. A query must contain both the field to be searched and the term to search for, e.g., q=objassoccult_s:Kashaya.
Query terms are NOT case-sensitive. However, query field names ARE case-sensitive.
The index returns matches where the search term is in the specified field, e.g., q=objtitle_s:view returns records in which the word "view" occurs anywhere in the title.
Multiple search terms can be included by placing a plus sign (the URL encoding of a space) between them, e.g., q=Projectile+Point. This example will find instances of Projectile or Point. To query for an exact phrase, enclose it in double quotes, e.g., q="Projectile+Point".
More complex queries, including queries on multiple fields and wildcards, are described in the Advanced query tips and examples section below.
The value of the wt parameter can be any of the following; xml is the default if the parameter is omitted.
Response formats can be examined by sending a request to the API. A convenient URL might be the one used to generate the attached zip file of response document examples (coming soon), which gives a small number of responses (simply vary the value of the wt parameter to see different formats; naturally, you must also substitute valid API app_id and app_key values):
|indent||"Pretty printed" response format||
If this parameter is not included, or is included and set to the value off, the response will contain no extra whitespace for human readability. This is usually an appropriate choice if code is going to parse the API's response document.
If this parameter is included and set to any value other than off, the response will be formatted so that it can be more easily read by humans. This is usually an appropriate choice for those investigating the API manually and inspecting the response document(s) visually.
Here's an example. Note that the request queries for records that include the word "sandal" in the field objdescr_s, and specifies indented JSON as the response format.
curl -v -H "app_id:abcdefgh" -H "app_key:12345678901234567890123456789012" -X GET "https://apis.berkeley.edu/hearst_museum/select?q=objdescr_s:sandal&wt=json&indent=on"
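Once a JSON response arrives, the matching records can be pulled out of it. The following Python sketch assumes the API passes Solr's standard JSON response through unchanged (a responseHeader object plus a response object holding numFound, start, and docs). SAMPLE is a hand-written, hypothetical document in that shape, not real output from the API, and the function names are illustrative.

```python
import json

# Hypothetical sample in the shape of Solr's standard JSON response.
# The values are invented for illustration and are not real API output.
SAMPLE = """
{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {"objname_s": "sandal", "blob_ss": ["hypothetical-blob-id"]}
    ]
  }
}
"""


def extract_docs(response_text):
    """Return (total match count, list of records) from a JSON response."""
    body = json.loads(response_text)["response"]
    return body["numFound"], body["docs"]


def field_values(doc, field):
    """Return a field's values as a list; fields may be absent or multi-valued."""
    value = doc.get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


total, docs = extract_docs(SAMPLE)
for doc in docs:
    print(field_values(doc, "objname_s"), field_values(doc, "blob_ss"))
```

numFound counts every match for the query, while docs holds only the rows returned in this response, so the two can differ.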
Advanced query tips and examples
Specify number of results returned
|rows||Specify number of rows to be returned, e.g. &rows=100||The default number of rows returned is 10.|
Paging results: get specific rows
|start||When included, this parameter specifies the offset into the complete result set at which the set of returned documents should begin (i.e., the index of the first record returned). E.g., &start=125||The default offset (i.e., when this parameter is not included) is zero (0). Result sets are indexed beginning with zero, i.e., the first record in the complete result set is specified as record 0 (zero), the second record is record 1, etc.|
|rows||Specify number of rows to be returned, e.g. &rows=25||The default number of rows returned is 10.|
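The arithmetic behind paging with start and rows can be sketched as follows. This is an illustrative Python helper, not part of the API; the function names are hypothetical, and pages are counted from zero to match the zero-based result set.

```python
def page_params(page, rows=10):
    """Return the start/rows parameters for a zero-based page number."""
    if page < 0 or rows < 1:
        raise ValueError("page must be >= 0 and rows must be >= 1")
    return {"start": page * rows, "rows": rows}


def page_starts(num_found, rows=10):
    """List the start offsets needed to walk through num_found results."""
    return list(range(0, num_found, rows))


# Page 5 at 25 rows per page begins at record 125, as in &start=125&rows=25.
print(page_params(5, rows=25))
# A result set of 60 records at 25 rows per page needs three requests.
print(page_starts(60, rows=25))
```

The total number of matches needed by page_starts would come from the first response to the query.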
Query on multiple fields
A set of query terms may include multiple fields, using the operators AND, OR, NOT, + or -. If no operators are used, OR is assumed (default).
The + operator requires the search term following the + to exist somewhere in the specified field for a result to be included in the returned set; the - operator excludes results in which the search term following the - appears.
Examples (note that spaces are not escaped below, for readability; in actual URLs, they must be replaced by %20 in every case):
- q=objname_s:Basket AND objassoccult_s:Pomo
- q=objname_s:Basket AND NOT objassoccult_s:Pomo
- q=objname_s:Basket AND -objassoccult_s:Pomo
- q=objname_s:Basket OR objassoccult_s:Pomo
- q=objname_s:Basket objassoccult_s:Pomo
Note that examples 2 and 3 return the same result set, as do examples 4 and 5.
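Queries like those above can be assembled and URL-encoded programmatically. The following is a minimal Python sketch; the helper names (term, all_of, any_of, exclude, encode_q) are hypothetical, and the backslash escaping of quotes inside a phrase follows standard Solr query syntax rather than anything stated on this page.

```python
from urllib.parse import quote


def term(field, value):
    """Return a field:term clause, double-quoting values that contain spaces."""
    if " " in value:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        value = '"' + escaped + '"'
    return field + ":" + value


def all_of(*clauses):
    """Join clauses so that every one must match."""
    return " AND ".join(clauses)


def any_of(*clauses):
    """Join clauses so that at least one must match."""
    return " OR ".join(clauses)


def exclude(clause):
    """Negate a clause."""
    return "NOT " + clause


def encode_q(query):
    """Build the q parameter, replacing spaces with %20 as noted above."""
    return "q=" + quote(query, safe=":")


baskets_not_pomo = all_of(
    term("objname_s", "Basket"),
    exclude(term("objassoccult_s", "Pomo")),
)
print(encode_q(baskets_not_pomo))
```

The printed value corresponds to example 2 in the list above, with its spaces encoded for use in a URL.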
Specify fields to be returned
|fl||When this parameter is included, only the fields listed will be returned in a result set (empty fields may not be returned, depending on the response format requested). E.g., &fl=objname_s,objinscrtext_ss||The set of fields to be returned can be specified as a comma- (or space-) separated list of field names.|
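A field list can be combined with the other parameters when building a query string. The following Python sketch is illustrative only: select_params is a hypothetical helper, and wt=json is set here as an assumption for convenience (the API's default is xml).

```python
from urllib.parse import urlencode


def select_params(query, fields=None, rows=None):
    """Build a query string that limits the response to the listed fields."""
    params = {"q": query, "wt": "json"}
    if fields:
        # fl takes a comma-separated list of field names.
        params["fl"] = ",".join(fields)
    if rows is not None:
        params["rows"] = rows
    # Keep ":" and "," literal so the result stays readable.
    return urlencode(params, safe=":,")


print(select_params("objname_s:Basket", ["objname_s", "objinscrtext_ss"]))
```

Requesting only the fields an application needs keeps response documents small, which matters when rows is set high.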
Use wildcards in queries
Wildcards can be inserted in query terms at the start, middle, or end of a term. Asterisk (*) is the wildcard character. Considering also that query terms are case-insensitive, the following query field:term expressions all return the same result set:
|sort||Sort results by Solr-calculated relevancy score. E.g., &sort=score+asc or &sort=score+desc||The score pseudo-field is the only field in this API (besides id) that can be used to sort a result set. The sort direction can be ascending (asc) or descending (desc).|
Where can I learn more about querying the API?
The Hearst Museum's collections data API is backed by an instance of Apache Solr; the full Solr API is not described on this page. For example, faceted search is not described here, but is described in the Solr documentation. The Apache Solr Reference Guide (v4.8) is available in PDF form from multiple mirror sites listed here. The section on Specifying Terms for the Standard Query Parser may be of particular interest to HackTheHearst participants, as it gives a much more complete picture of how queries in requests to this API can be formulated.
You can use many tools—most web browsers, curl, Firefox's Poster plugin, etc.—to query the API. An especially convenient tool for querying the API with parameters described on this page will soon be available at UC Berkeley's API Central, where we'll provide browser-based interactive documentation to the Hearst Museum's collections data API.