HackTheHearst API Detail

This page includes information about the API to the Phoebe A. Hearst Museum of Anthropology's digital collections data. Use of the API implies agreement with its Terms of Service and Disclaimer, given on HackTheHearst's main API page (parent to this page).

The API is backed by an Apache Solr instance; the full Solr API is not described on this page. For example, faceted search is not described here, but is described in the Solr documentation. To learn about ways to query the Solr API not covered on this page, consult the Apache Solr Reference Guide (also available as a PDF).

Note that authorization credentials (the app_id and app_key values) must be submitted with every request in order to access the API; as the examples further down this page show, they are passed as HTTP headers. App IDs and Keys will be distributed to teams at the HackTheHearst kickoff. Interactive documentation (a web page in which parameters to the API may be entered, and which displays the response documents returned when a request with those parameters is sent to the API) will be linked from this page by the time the HackTheHearst kickoff begins.


Fields retrievable via the Hearst Museum Solr API

The fields given in the list below are metadata to items in the Hearst Museum digital collections data. The primary objects they reference are physical objects in the Hearst Museum's collections. Images of, or associated with, these primary objects may be accessed via the URLs given in the blob_ss field.

The first table shown is a quick overview of the fields, and the second table covers the same fields in greater detail.

Field Name Brief descriptions
csid_s Primary key for object
objmusno_s The number assigned to a specific cataloged object
objsortnum_s A sortable version of objmusno_s
objaltnum_ss Other numbers that were previously assigned to the object
objtype_s The general type of object (e.g., archaeology versus ethnography)
objname_s A name for the cataloged object
objcount_s The piece count for the cataloged object
objcountnote_s Notes regarding the piece count
objtitle_s A title for the cataloged object
objdescr_s A description of the cataloged object
objdept_s The primary "catalog" to which the cataloged object belongs
objfcp_s The field collection place of the object
objfcpverbatim_s A verbatim statement of provenience
objfcpgeoloc_s The latitude and longitude of the field collection place
objfcpelevation_s The elevation of the field collection place
objfcptree_ss The higher-order geographic place names for the field collection place
objassoccult_s The cultural group(s) associated with the object in its original context
objculturetree_ss The higher-order cultural group names for the associated cultural group(s)
objkeelingser_s Used for audio recordings only: The Keeling series number
objinventory_s The name of the relevant NAGPRA inventory
objfilecode_ss An indication of the original general use of the cataloged object
objcontextuse_s The context in which the object was originally used
objdimensions_ss Measurements of the cataloged object
objmaterials_ss Materials from which the cataloged object was made
objinscrtext_ss Textual inscriptions found on the cataloged object
objcomment_s Additional, relevant information about an object
objcollector_ss The name(s) of the person(s) who collected the cataloged object
objcolldate_s The date that the object was collected from the field (as text)
objcolldate_begin_dt The earliest possible collection date (as ISO-8601 date)
objcolldate_end_dt The latest possible collection date (as ISO-8601 date)
objproddate_s The date that the object was produced (as text)
objproddate_begin_dt The earliest possible production date (as ISO-8601 date)
objproddate_end_dt The latest possible production date (as ISO-8601 date)
objacqdate_ss The date that the object was acquired by the Museum (as text)
objacqdate_begin_dt The earliest possible acquisition date (as ISO-8601 date)
objacqdate_end_dt The latest possible acquisition date (as ISO-8601 date)
objaccdate_ss The date that the object was accessioned into the Museum (as text)
objaccdate_begin_dt The earliest possible accession date (as ISO-8601 date)
objaccdate_end_dt The latest possible accession date (as ISO-8601 date)
objaccno_ss The accession number(s) for the cataloged object
blob_ss The GUID identifier of the associated media objects

The following table provides additional descriptive detail for each of these fields:

Field Name Fuller descriptions
csid_s A unique internal identifier for the catalog record. The object csid is in the form of a GUID.

Example: "70d40782-6d11-4346-bb9b-2f85f1e00e91".
objmusno_s The identifying number of the catalog record, called a Museum Number.

Examples: "11-37781", "K-3720a,b" or "2013.01.0001"

Note: Until 2012, Museum numbers were created and assigned following this pattern:
   [catalog designation]-[object number][optional suffix]
For example, "18-1339a,b" or "18-319g"
The catalog designations are generally 1- or 2-digit numbers or a single letter.
The object numbers are sequentially assigned integers
The suffixes are optional and are used to specify sub-objects within a catalog record.

Since 2012, a more conventional system of numbering has been adopted, following this pattern:
   [4-digit year].[2-digit accession number].[4-digit object number]
The accession number resets to 01 with each new year.
The object number resets to 0001 with each new accession.
objsortnum_s A formatted string that, when used for sorting, returns objects in their correct order (i.e., "2-30" comes after "2-4" and before "2-1000"). Used for sorting only.

Examples: "000011 022681 aj", "000008 000511 m"
objaltnum_ss An alternate number for an object. This field can have multiple values. Each value is a concatenation of the alternate number, the type of alternate number (in parentheses), and an optional note about the alternate number (which is also set inside the parentheses, when present).

Example: "1 (song number)"
objtype_s A short controlled string indicating the general type of object represented by the catalog record.

Examples: "(not specified)", "none (Registration)", "archaeology", "ethnography", "documentation", "sample", "indeterminate", "unknown".
objname_s An uncontrolled string that provides a short name for the object.

Example: "model canoe"
objcount_s The number of objects or pieces that comprise the catalog record. For instance, a quiver with four arrows might be cataloged together as a single catalog record and given a single Museum Number, but the object count will be set to 5 (1 quiver + 4 arrows).
objcountnote_s An uncontrolled text field which can contain additional information about the count.

Examples: "weighed but not counted", "13 bags of fragments"
objtitle_s A concatenation of the title assigned to an object and the type of title it is.

Examples: "Petroglyph, People (Title Subject)", "Tallac House, Lake Tahoe. :1009 (Artist's Label)"
objdescr_s A fuller, but still terse, description of the object. The description may include information from other fields, such as materials, dimensions, and dates.

Example: "Medal; AE (copper, brass, or bronze) and gold; obverse: Eagle with Snake in beak, cactus; surrounded by wreath, flags, symbols of Spanish-Mexico; reverse: scene of place of battle, cannon and flags; surrounded by inscription; 1829; 1 1/2 x 1 15/16 inch oval. Recatalogued to 3-16176."
objdept_s A controlled field which contains a brief description of the primary curatorial area of an object. Now obsolete, but still helpful in many cases; for example, when an object has poor metadata.

Examples: "Cat. 11 - Oceania (incl. Australia)", "Cat. 6 - Ancient Egypt (the Hearst Reisner Egyptian Collection)"
objfcp_s The field collection place of the object. For archaeological objects, this is usually the archaeological site; for ethnographic objects, this is usually the place where the object was purchased or otherwise obtained.

Examples: "Santa Rosa, Sonoma county, California", "San Pedro de los Conchos, Chihuahua state, Mexico"
objfcpverbatim_s The verbatim text entered in the "provenience" field on the original paper catalog cards. This field often contains outdated place names (e.g., Northern Rhodesia), and will often contain information about the associated cultural group and/or the name of the maker.
objfcpgeoloc_s The decimal latitude and longitude of the field collection place, presented in the form of latitude, longitude.

Example: "-8.11243825276631725, -79.0745471790682473"
objfcpelevation_s An expression of the elevation (in meters above sea level) of the field collection place. Note: this field is a text field, and can contain values in different formats.

Examples: "81", "1264 m", or "1000 m (avg)".
objfcptree_ss A field that contains the display names of the field collection place and all of the parent places of the field collection place, up to the level of continent. This field can have multiple values.

Example (all are values for the same object): "Oceania", "Micronesia, Oceania", "Caroline Islands, Micronesia, Oceania", "Federated States of Micronesia, Micronesia, Oceania", "Yap State, Federated States of Micronesia, Micronesia", "Yap Island, Yap State, Federated States of Micronesia", "Weloy municipality, Yap Island, Yap State", "Damkil site, Weloy municipality, Yap Island"
objassoccult_s A controlled string expression that indicates the cultural group(s) associated with the object in its original context.

Examples: "Pomo", "Yurok"
objculturetree_ss A field that contains the display names of the associated cultural group and of all the parent culture groups of the associated cultural group. This field can have multiple values, and may contain values from multiple hierarchies if there are multiple associated cultures. An "@" symbol indicates that the associated term is a guide term.

Example (all are values for the same object): "@The Americas", "@North America", "@North American native cultures", "@California tribes", "@Northern California tribes", "@Northwestern California tribes", "Klamath River Tribes", "Yurok"
objkeelingser_s (Used for audio recordings only.) The number and name of the series of recordings to which this object belongs. These series were described by Richard Keeling in his 1990 book, A Guide to Early Field Recordings (1900–1949) at the Lowie Museum of Anthropology.
objinventory_s A controlled field indicating the name of the submitted document ("inventory") on which the object was reported to the National Park Service to comply with NAGPRA (the Native American Graves Protection and Repatriation Act). The Museum has submitted 208 NAGPRA inventories, and the complete list can be viewed here.

Examples: "caSanFranciscoCounty1", "caChumash2", "caCco138"
objfilecode_ss An indication of the classification of ethnographic objects according to criteria of use and function. The values in this field begin with a code in the form of "#.#". The first digit of this code indicates the general classification of the object:
  1. Utensils, implements, and conveyances
  2. Secular dress and accoutrements and adornment
  3. Status objects and insignia of office
  4. Structures and furnishings
  5. Ritual, pageantry and recreation
  6. Child care and enculturation
  7. Communication, records, currency, and measures
  8. Raw materials
The meaning of the second digit is stated in a brief string following the two-digit code. A full list of all ethnographic file codes, their meanings, and examples, can be found here.

Example: "2.3 Special ornaments, garb, and finery worn to battle by warriors (excluding status insignia)".
objcontextuse_s The context in which the object was originally used.

Example: "worn as ornamentation in hair during jump dance."
objdimensions_ss A concatenation of up to four fields of information:
  1. The name of the part that was measured (e.g., "handle"—not required)
  2. The dimension that was measured (e.g., "length"—required)
  3. The value of the dimension (e.g., "5.5"—required)
  4. The units of measurement that were used (e.g., "centimeters"—required)
This field can have multiple values.

Examples: "height 10.5 centimeters", "duration 123 seconds"
objmaterials_ss A concatenation of up to three fields of information:
  1. The material that was documented (e.g., "marble"—required)
  2. The component of the object made of this material (e.g., "frame"—not required)
  3. A note about this material (e.g., "burnished"—not required)
This field can have multiple values, which are delimited by a pipe character ("|").

Examples: "Obsidian", "Cotton", "Ivory", "Ceramic|Shell"
objinscrtext_ss A description and transcription of all textual inscriptions placed on the object.

Example: "Obverse legend: IMP GORDIANVS PIVS FEL AVG|Reverse legend: P M T R P IIII COS III P P |Reverse exergue: SC"
objcomment_s Additional relevant information about an object that is out of scope for inclusion in the object description.

Example: "Photo: 15-8656. Published: Ill. C.D. Forde, UCPAAE 28:4, Pl. 52. Remarks: \"originally included 1-27074a--5 gourds--They have been given a new # 1-234129.\"" (NB: the included quotation marks have been escaped in this example)
objcollector_ss The name(s) of the person(s) who collected this object, often an archaeologist or ethnographer. This field can have multiple values.

Examples: "Prof. Robert F. Heizer [1915-1979]", "Dr. Alfred Emerson"
objcolldate_s An irregular string indicating the date that the object was collected from the field. For archaeological objects, this is the date that the object was excavated or otherwise found, and for ethnographic objects, this is the date that the object was purchased, presented, or found.

Examples: "Summer, 1925", "August 7, 1978"
objcolldate_begin_dt An ISO-8601 representation of the earliest possible collection date. Together with objcolldate_end_dt, forms a date range.

Example: "1932-01-01T19:00:00Z"
objcolldate_end_dt An ISO-8601 representation of the latest possible collection date. Together with objcolldate_begin_dt, forms a date range.

Example: "1937-01-01T19:00:00Z"
objproddate_s An irregular string indicating the date that the object was produced, minted, or otherwise manufactured.

Examples: "125 BC", "1680–1720"
objproddate_begin_dt An ISO-8601 representation of the earliest possible production date. Together with objproddate_end_dt, forms a date range.

Example: "1932-01-01T19:00:00Z"
objproddate_end_dt An ISO-8601 representation of the latest possible production date. Together with objproddate_begin_dt, forms a date range.

Example: "1937-01-01T19:00:00Z"
objacqdate_ss An irregular string indicating the date that the object was acquired by the Museum.
objacqdate_begin_dt An ISO-8601 representation of the earliest possible acquisition date. Together with objacqdate_end_dt, forms a date range.

Example: "1932-01-01T19:00:00Z"
objacqdate_end_dt An ISO-8601 representation of the latest possible acquisition date. Together with objacqdate_begin_dt, forms a date range.

Example: "1932-01-01T19:00:00Z"
objaccdate_ss An irregular string indicating the date that the object was formally accessioned into the Museum's collections.
objaccdate_begin_dt An ISO-8601 representation of the earliest possible accession date. Together with objaccdate_end_dt, forms a date range.

Example: "1932-01-01T19:00:00Z"
objaccdate_end_dt An ISO-8601 representation of the latest possible accession date. Together with objaccdate_begin_dt, forms a date range.

Example: "1932-01-01T19:00:00Z"
objaccno_ss The accession number(s) for the cataloged object.

Examples: "Acc.2467", "Acc.634", "Acc.500BN"
blob_ss The GUID identifier of the media object (usually an image) in CollectionSpace; used in conjunction with other API parameters, it can be used to retrieve an image or one of its several derivatives. See http://wiki.collectionspace.org/display/collectionspace/Blob+Service+RESTful+API for details.
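
Several of the fields above pack structured information into plain strings. The following Python sketch shows one way a client might unpack three of them, based only on the formats and examples given in the table. The function names and regular expressions are illustrative assumptions, not part of the API, and real records may contain values that these patterns do not match (the catalog designations, for instance, are only "generally" one or two digits or a single letter).

```python
import re
from datetime import datetime, timezone

# Pre-2012 pattern: [catalog designation]-[object number][optional suffix]
OLD_MUSNO = re.compile(r"^(?P<catalog>[A-Za-z]|\d{1,2})-(?P<number>\d+)(?P<suffix>.*)$")
# Post-2012 pattern: [4-digit year].[2-digit accession number].[4-digit object number]
NEW_MUSNO = re.compile(r"^(?P<year>\d{4})\.(?P<accession>\d{2})\.(?P<number>\d{4})$")


def parse_museum_number(value):
    """Split an objmusno_s value into its parts, or return None if unrecognized."""
    match = NEW_MUSNO.match(value) or OLD_MUSNO.match(value)
    return match.groupdict() if match else None


def parse_geoloc(value):
    """Convert an objfcpgeoloc_s value ("latitude, longitude") into two floats."""
    lat, lon = (float(part) for part in value.split(","))
    return lat, lon


def parse_solr_date(value):
    """Parse a *_dt value such as "1932-01-01T19:00:00Z" into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


print(parse_museum_number("18-1339a,b"))
print(parse_museum_number("2013.01.0001"))
print(parse_geoloc("-8.11243825276631725, -79.0745471790682473"))
print(parse_solr_date("1932-01-01T19:00:00Z").year)
```

Text fields such as objfcpelevation_s and objcolldate_s are explicitly irregular, so a client should expect to fall back to the raw string when parsing fails.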

Including Authentication Credentials to obtain API access

The API requires that authentication credentials be submitted with each API request (RESTful APIs are stateless, so there's no "logged in" session maintained between requests). The app_id and app_key parameters must be included in each cURL request to the API; a 403 Forbidden error will be returned if valid values for these parameters are missing from the request.

Here's an example. Note that the authentication parameters are passed as HTTP headers, not as query parameters:

curl -v -H "app_id:abcdefgh" -H "app_key:12345678901234567890123456789012" -X GET \
  "https://apis.berkeley.edu/hearst_museum/select?q=objname_s:headdress&wt=json&indent=on"

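The curl request above can also be assembled from Python. This is a minimal sketch using only the standard library; the build_request helper is a hypothetical name, the credentials are the same placeholders as in the curl example, and the request is built but not sent. Note that urllib percent-encodes the colon in the query and capitalizes header names; HTTP header names are case-insensitive, so this is assumed to be acceptable to the API.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://apis.berkeley.edu/hearst_museum/select"


def build_request(query, app_id, app_key, wt="json", indent="on"):
    """Build (but do not send) a GET request carrying the credentials as headers."""
    params = urllib.parse.urlencode({"q": query, "wt": wt, "indent": indent})
    return urllib.request.Request(
        f"{BASE_URL}?{params}",
        headers={"app_id": app_id, "app_key": app_key},
        method="GET",
    )


# Placeholder credentials; substitute the values issued to your team.
request = build_request("objname_s:headdress", "abcdefgh", "12345678901234567890123456789012")
print(request.full_url)
# To send it: body = urllib.request.urlopen(request).read()
```
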
Basic Solr URL parameters: query and specify response format

NOTES:

  • The API only responds to the HTTP verb GET: The Hearst Museum is giving access to a read-only collection of materials.
  • The fields described above are metadata. One of these (blob_ss) gives the physical location of the associated photos. Thus, for example, identifying an image of interest and obtaining it requires multiple requests against the API (and because multiple URLs may be included in the blob_ss field, parsing of that field's contents for the desired image among multiple options may also be required).
  • When specifying field names, be aware that they are case-sensitive (query terms, however, are case-insensitive).

Request Param Function Notes
q Query

This parameter holds the query statement. A query must contain both the field to be searched and the term to search for, e.g., q=objassoccult_s:Kashaya.

Query terms are NOT case-sensitive. However, query field names ARE case sensitive.

The index returns matches where the search term is in the specified field, e.g., q=objtitle_s:view returns records in which the word "view" occurs anywhere in the title.

Multiple search terms can be included by placing a plus-sign between them, e.g.: q=Projectile+Point—this example will find instances of Projectile or Point. To query for an exact phrase, enclose it in double-quotes, e.g., q="Projectile+Point".

More complex queries, including queries on multiple fields and wildcards, are described in the Advanced query tips and examples section below.

wt Response format

The value of this param can be any of the following; xml is the default if the parameter is omitted.

  • xml
  • json
  • csv
  • ruby
  • python
  • php

Response formats can be examined by sending a request to the API. A convenient URL might be the one used to generate the attached zip file of response document examples (coming soon), which gives a small number of responses (simply vary the value of the wt parameter to see different formats; naturally, you must also substitute valid API app_id and app_key values):

curl -v -H "app_id:abcdefgh" -H "app_key:12345678901234567890123456789012" \
  -X GET "https://apis.berkeley.edu/hearst_museum/select?q=objcollector_ss:Kroeber&wt=json&indent=on"
indent "Pretty printed" response format

If this parameter is not included, or is included and set to the value off, no extra whitespace is added to the response. This is usually an appropriate choice if code is going to be used to parse the API's response document.

If this parameter is included and set to any value other than off, the response will be formatted so that it can be more easily read by humans. This is usually an appropriate choice for those investigating the API manually and inspecting the response document(s) visually.

Here's an example. Note that the request queries for records that include the word "sandal" in the field objdescr_s and specifies indented JSON as the response format.

curl -v -H "app_id:abcdefgh" -H "app_key:12345678901234567890123456789012" -X GET \
  "https://apis.berkeley.edu/hearst_museum/select?q=objdescr_s:sandal&wt=json&indent=on"

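When wt=json is requested, the response can be parsed with any JSON library. The sketch below assumes the API returns Solr's standard JSON layout (a responseHeader object, plus a response object holding numFound, start, and docs) without modification; that is an assumption, not something this page confirms. The sample document is hand-written from example values in the field table and is not real API output, and summarize is a hypothetical helper name.

```python
import json

# Hand-written sample in the standard Solr JSON layout; not real API output.
SAMPLE = """
{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "csid_s": "70d40782-6d11-4346-bb9b-2f85f1e00e91",
        "objname_s": "model canoe"
      }
    ]
  }
}
"""


def summarize(body):
    """Return (total matches, list of (csid, name) pairs) from a JSON response body."""
    response = json.loads(body)["response"]
    # Use .get() because fields that are empty for a record may be absent.
    docs = [(doc.get("csid_s"), doc.get("objname_s")) for doc in response["docs"]]
    return response["numFound"], docs


total, docs = summarize(SAMPLE)
print(total, docs)
```
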
Advanced query tips and examples

Specify number of results returned
Paging results: get specific rows
Query on multiple fields
Specify fields to be returned
Use wildcards in queries
Sorting results
Where can I learn more about querying the API?


Specify number of results returned

Request Param Function Notes
rows Specify number of rows to be returned, e.g. &rows=100 The default number of rows returned is 10.


Paging results: get specific rows

Request Param Function Notes
start When included, this parameter specifies the offset into the complete result set at which the returned documents should begin, e.g., &start=125. The default offset (i.e., when this parameter is not included) is zero (0). Result sets are indexed beginning with zero, i.e., the first record in the complete result set is record 0 (zero), the second record is record 1, etc.
rows Specify number of rows to be returned, e.g. &rows=25 The default number of rows returned is 10.
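
Combining start and rows lets a client walk through a large result set one page at a time. A minimal sketch of the arithmetic, assuming the total number of matches is already known (for example from an earlier response); page_params is a hypothetical helper, not part of the API:

```python
def page_params(total, rows=10):
    """Yield (start, rows) pairs that together cover `total` zero-indexed results."""
    for start in range(0, total, rows):
        yield start, rows


# Fetch 60 matching records in pages of 25.
for start, rows in page_params(total=60, rows=25):
    print(f"&start={start}&rows={rows}")
```

With 60 matches and 25 rows per page, the offsets are 0, 25, and 50, and the last page holds only the remaining 10 records.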


Query on multiple fields

A set of query terms may include multiple fields, using the operators AND, OR, NOT, + or -. If no operators are used, OR is assumed (default).

The + operator requires the search term following the + to exist somewhere in the specified field for a result to be included in the returned set; the - operator excludes results in which the search term following the - appears.

Examples (note that spaces are not escaped below, for readability; in actual URLs, they must be replaced by %20 in every case):

  1. q=objname_s:Basket AND objassoccult_s:Pomo
  2. q=objname_s:Basket AND NOT objassoccult_s:Pomo
  3. q=objname_s:Basket AND -objassoccult_s:Pomo
  4. q=objname_s:Basket OR objassoccult_s:Pomo
  5. q=objname_s:Basket objassoccult_s:Pomo

Note that examples 2 and 3 return the same result set, as do examples 4 and 5.
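
Because the spaces in these expressions must be replaced by %20 in a real URL, it may be easier to let a library do the escaping. The sketch below uses Python's urllib.parse.quote for this; encode_query is a hypothetical helper name, and keeping the colon unescaped is a readability choice rather than a requirement.

```python
from urllib.parse import quote


def encode_query(expression):
    """Return a q= parameter with spaces and other unsafe characters percent-encoded."""
    # Keep ":" readable; quote() turns each space into %20.
    return "q=" + quote(expression, safe=":")


examples = [
    "objname_s:Basket AND objassoccult_s:Pomo",
    "objname_s:Basket AND NOT objassoccult_s:Pomo",
    "objname_s:Basket AND -objassoccult_s:Pomo",
    "objname_s:Basket OR objassoccult_s:Pomo",
    "objname_s:Basket objassoccult_s:Pomo",
]
for expression in examples:
    print(encode_query(expression))
```

quote() also encodes a literal + as %2B. This matters when + is meant as the "required" operator, because an unencoded + in a URL query string is decoded as a space.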


Specify fields to be returned

Request Param Function Notes
fl When this parameter is included, only the fields listed will be returned in a result set (empty fields may not be returned, depending on the response format requested). E.g., &fl=objname_s,objinscrtext_ss The set of fields to be returned can be specified as a comma- (or space-) separated list of field names.
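
As an illustration, a query string that restricts the returned fields might be assembled as follows. This is a sketch only: select_fields is a hypothetical helper, and the safe argument simply keeps the colon and commas unescaped so the result resembles the examples on this page.

```python
from urllib.parse import urlencode


def select_fields(query, fields):
    """Build a query string that asks for only the named fields."""
    return urlencode({"q": query, "fl": ",".join(fields)}, safe=":,")


print(select_fields("objname_s:Basket", ["objname_s", "objinscrtext_ss"]))
```

Requesting only the fields a program needs keeps responses small, which helps when paging through many records.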


Use wildcards in queries

Wildcards can be inserted in query terms at the start, middle, or end of a term. Asterisk (*) is the wildcard character. Considering also that query terms are case-insensitive, the following query field:term expressions all return the same result set:


Sorting results

Request Param Function Notes
sort Sort results by Solr-calculated relevancy score. E.g., &sort=score+asc or &sort=score+desc The score pseudo-field is the only field in this API (besides id) that can be used to sort a result set. The sort direction can be ascending (asc) or descending (desc).
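
The sort value contains a space between the field and the direction, which appears as + in the examples above. A small sketch of building it in Python, where sorted_query is a hypothetical helper and urlencode's default form encoding supplies the +:

```python
from urllib.parse import urlencode


def sorted_query(query, descending=True):
    """Build a query string sorted by relevancy score in the given direction."""
    direction = "desc" if descending else "asc"
    # urlencode() encodes the space in "score desc" as "+".
    return urlencode({"q": query, "sort": f"score {direction}"}, safe=":")


print(sorted_query("objdescr_s:sandal"))
print(sorted_query("objdescr_s:sandal", descending=False))
```
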


Where can I learn more about querying the API?

The Hearst Museum's collections data API is backed by an instance of Apache Solr; the full Solr API is not described on this page. For example, faceted search is not described here, but is described in the Solr documentation. The Apache Solr Reference Guide (v4.8) is available in PDF form from multiple mirror sites listed here. The section on Specifying Terms for the Standard Query Parser may be of particular interest to HackTheHearst participants, as it gives a much more complete picture of how queries in requests to this API can be formulated.

You can use many tools—most web browsers, curl, Firefox's Poster plugin, etc.—to query the API. An especially convenient tool for querying the API with parameters described on this page will soon be available at UC Berkeley's API Central, where we'll provide browser-based interactive documentation to the Hearst Museum's collections data API.
