Query¶
- class marvin.tools.query.Query(search_filter=None, return_params=None, return_type=None, targets=None, quality=None, mode=None, return_all=False, default_params=None, nexus='cube', sort='mangaid', order='asc', caching=True, limit=100, count_threshold=1000, verbose=False, release=None)[source]¶
Bases:
object
A class to perform queries on the MaNGA dataset.
This class is the main way of performing a query. At minimum, a query requires a string filter condition in a natural-language SQL format, plus a list of desired parameters to return.
Query will use a local database if it finds one. Otherwise, a remote query uses the API to run the query on the Utah server and return the results.
The Query returns a list of tupled parameters and passes them into the Marvin Results object. The parameters are a combination of user-defined return parameters, parameters used in the filter condition, and a set of pre-defined default parameters. The object plateifu or mangaid is always returned by default. For queries involving DAP properties, the bintype, template, and spaxel x and y are also returned by default.
- Parameters
search_filter (str) – A (natural language) string containing the filter conditions in the query.
return_params (list) – A list of string parameter names desired to be returned in the query
return_type ({'cube', 'spaxel', 'maps', 'rss', and 'modelcube'}) – The requested Marvin Tool object that the results are converted into.
targets (list) – A list of manga_target flags to filter on
quality (list) – A list of quality flags to filter on
mode ({'local', 'remote', 'auto'}) – The load mode to use. See Mode decision tree.
return_all (bool) – If True, attempts to return the entire set of results. Default is False.
default_params (list) – Optionally specify additional parameters as defaults
sort (str) – The parameter name to sort the query on
order ({'asc', 'desc'}) – The sort order. Can be either ascending or descending.
limit (int) – The limit on the number of returned results
count_threshold (int) – The threshold number to begin paginating results. Default is 1000.
nexus (str) – The name of the database table to use as the nexus point for building the join table tree. Can only be set in local mode.
caching (bool) – If True, turns on the dogpile memcache caching of results. Default is True.
verbose (bool) – If True, turns on verbosity.
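As a rough illustration of how the returned parameter set described above is assembled, the sketch below merges default parameters, user-requested return parameters, and any table.column names found in the filter string. The helper name combine_params, the regular expression, the ordering, and the choice of cube.mangaid as the only built-in default are assumptions made for this example; they are not Marvin's implementation.

```python
import re

# Hypothetical default: the docs only state that plateifu or mangaid is always returned.
DEFAULT_PARAMS = ['cube.mangaid']

# Matches table.column tokens such as 'nsa.z'; numeric literals like '1.e10' are skipped.
PARAM_PATTERN = re.compile(r'\b[A-Za-z_]\w*\.[A-Za-z_]\w*')


def combine_params(search_filter, return_params=None, default_params=None):
    """Merge defaults, requested parameters, and filter parameters in first-seen order."""
    filter_params = PARAM_PATTERN.findall(search_filter or '')
    candidates = (DEFAULT_PARAMS + list(default_params or [])
                  + list(return_params or []) + filter_params)
    combined = []
    for name in candidates:
        if name not in combined:  # drop duplicates, keep the first occurrence
            combined.append(name)
    return combined


params = combine_params('nsa.z < 0.1 and nsa.elpetro_mass > 1.e10',
                        return_params=['cube.ra', 'cube.dec'])
print(params)
```

With these inputs the filter contributes nsa.z and nsa.elpetro_mass in addition to the two requested cube columns.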
- classmethod get_available_params(paramdisplay='best', release=None)[source]¶
Retrieve the available parameters to query on
Retrieves a list of the available query parameters. Can either retrieve a list of all the parameters or only the vetted parameters.
- Parameters
paramdisplay (str {all|best}) – String indicating to grab either all or just the vetted parameters. Default is to only return ‘best’, i.e. vetted parameters
- Returns
A list of all of the available queryable parameters
- run(start=None, end=None, query_type=None)[source]¶
Runs a Query
Runs a query either locally or remotely.
- Parameters
- Returns
An instance of the Results class containing the results of your Query.
Example
>>> # filter of "NSA redshift less than 0.1 and stellar mass > 1.e10"
>>> searchfilter = 'nsa.z < 0.1 and nsa.elpetro_mass > 1.e10'
>>> returnparams = ['cube.ra', 'cube.dec']
>>> q = Query(search_filter=searchfilter, return_params=returnparams)
>>> results = q.run()
- show(prop='query')[source]¶
Prints into the console
Displays the query to the console with parameter variables plugged in. Works only in local mode. Input prop can be one of query, joins, or filter.
- Allowed Values for Prop:
query: displays the entire query (default if nothing specified)
joins: displays the tables that have been joined in the query
filter: displays only the filter used on the query
- Parameters
prop (str) – The type of info to print. Can be ‘query’, ‘joins’, or ‘filter’.
- Returns
The SQL string
- property nexus¶
Query Utils¶
- class marvin.utils.datamodel.query.base.ParameterGroup(name, items, parent=None)[source]¶
Bases:
QueryFuzzyList
A Query Parameter Group Object
Query parameters are grouped into specific categories for ease of use and navigation. This object subclasses from the Python list object.
- Parameters
- list_params(name_type=None, subset=None)[source]¶
List the parameter names for a given group
Lists the Query Parameters of the given group
- property display¶
- property full¶
- property parameters¶
- property remote¶
- property short¶
- class marvin.utils.datamodel.query.base.ParameterGroupList(items)[source]¶
Bases:
QueryFuzzyList
ParameterGroup Object
This object inherits from the Python list object. This represents a list of query ParameterGroups.
- list_groups()[source]¶
Returns a list of query groups.
- Returns
names (list) – A string list of all the Query Group names
- list_params(name_type='full', groups=None)[source]¶
Returns a list of parameters from all groups.
Return a string list of the full parameter names. Default is all parameters across all groups.
- property best¶
List the best parameters in each group
- property names¶
List all the parameter groups
- property parameters¶
List all the queryable parameters
- class marvin.utils.datamodel.query.base.QueryDataModel(release, groups=[], aliases=[], exclude=[], **kwargs)[source]¶
Bases:
object
A class representing a Query datamodel
- to_table(pprint=False, max_width=1000, only_best=False, db=False)[source]¶
Write the datamodel to an Astropy table
- property best¶
- property best_groups¶
- property groups¶
Returns the groups for this datamodel.
- property parameters¶
- class marvin.utils.datamodel.query.base.QueryDataModelList(models=None)[source]¶
Bases:
DataModelList
A dictionary of Query datamodels.
- base_model¶
alias of
QueryDataModel
- base = {'QueryDataModel': <class 'marvin.utils.datamodel.query.base.QueryDataModel'>}¶
- base_name = 'QueryDataModel'¶
- class marvin.utils.datamodel.query.base.QueryFuzzyList(the_list, use_fuzzy=None)[source]¶
Bases:
FuzzyList
Fuzzy List for Query Parameters
- class marvin.utils.datamodel.query.base.QueryList(items)[source]¶
Bases:
QueryFuzzyList
A class for a list of Query Parameters
- class marvin.utils.datamodel.query.base.QueryParameter(full, table=None, name=None, short=None, display=None, remote=None, dtype=None, **kwargs)[source]¶
Bases:
object
A Query Parameter class
An object representing a query parameter. Provides access to different names for a given parameter.
- Parameters
full (str) – The full naming syntax (table.name) used for all queries. This name is recommended for full uniqueness.
table (str) – The name of the database table the parameter belongs to
name (str) – The name of the parameter in the database
short (str) – A shorthand name of the parameter
display (str) – A display name used for web and plotting purposes.
dtype (str) – The type of the parameter (e.g. string, integer, float)
- Variables
property – The DAP Datamodel Property corresponding to this Query Parameter
- property db_column¶
- property db_schema¶
- property db_table¶
- marvin.utils.datamodel.query.base.get_allowed_releases()[source]¶
Get the supported API / web MaNGA releases
- marvin.utils.datamodel.query.base.get_best_fuzzy(name, choices, cutoff=60, return_score=False)[source]¶
- marvin.utils.datamodel.query.base.is_supported_release(release)[source]¶
Check a release against the list of supported API releases
- marvin.utils.datamodel.query.base.strip_mapped(self)[source]¶
Strip the mapped items for display with dir
Since __dir__ cannot have . in the attribute name, this strips the returned mapper(item) parameter of any . in the name. Used for query parameter syntax [table.parameter_name]
For cases where the parameter_name is “name”, and thus non-unique, it also returns the mapper name with “.” replaced with “_”, to make unique. “ifu.name” becomes “ifu_name”, etc.
- Parameters
self – a QueryFuzzyList object
- Returns
list of mapped names stripped of dots
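The dot-stripping rule described for strip_mapped can be sketched in plain Python. The function name strip_dots and its list-in, list-out interface are assumptions for illustration only; the real function operates on a QueryFuzzyList and its mapper.

```python
def strip_dots(full_names):
    """Return dot-free names suitable for attribute-style listing."""
    stripped = []
    for full in full_names:
        table, _, column = full.rpartition('.')
        stripped.append(column)
        # 'name' occurs in several tables, so also expose a unique table_name form.
        if column == 'name' and table:
            stripped.append(full.replace('.', '_'))
    return stripped


names = strip_dots(['nsa.z', 'ifu.name', 'cube.ra'])
print(names)
```

Here ifu.name yields both name and ifu_name, while the other parameters yield only their column names.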
Results¶
- class marvin.tools.results.ResultSet(_objects, **kwargs)[source]¶
Bases:
list
A Set of Results
A list object representing a set of query results. Each row of the list is a ResultRow object, which is a custom Marvin namedtuple object. ResultSets can be extended column-wise or row-wise by adding them together.
- Parameters
_objects (list) – A list of objects. Required.
count (int) – The count of objects in the current list
totalcount (int) – The total count of objects in the full results
index (int) – The index of the current set within the total set.
columns (list) – A list of columns accompanying this set
results (Results) – The Marvin Results object this set is a part of
- sort(name=None, reverse=False)[source]¶
Sort the results
In-place sorting of the result set. This is the standard list sorting mechanism. When no name is specified, does standard list sorting with no key.
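A minimal sketch of this sorting behaviour, assuming rows are namedtuples and that name refers to one of their fields. RowList and Row are hypothetical stand-ins for ResultSet and ResultRow, not the Marvin classes.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical row type standing in for ResultRow.
Row = namedtuple('Row', ['mangaid', 'name', 'z'])


class RowList(list):
    """List of rows with an in-place sort on an optional field name."""

    def sort(self, name=None, reverse=False):
        # With no name, fall back to the standard list sort with no key.
        key = attrgetter(name) if name else None
        super().sort(key=key, reverse=reverse)


rows = RowList([
    Row('4-3988', '1901', 0.03),
    Row('4-3293', '1902', 0.01),
    Row('4-3602', '1901', 0.02),
])

rows.sort('z')
by_z = [row.mangaid for row in rows]

rows.sort(reverse=True)
by_default_desc = [row.mangaid for row in rows]
```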
- class marvin.tools.results.Results(results=None, mode=None, data_origin=None, release=None, count=None, totalcount=None, runtime=None, response_time=None, chunk=None, start=None, end=None, queryobj=None, query=None, search_filter=None, return_params=None, return_type=None, limit=None, params=None, **kwargs)[source]¶
Bases:
object
A class to handle results from queries on the MaNGA dataset
- Parameters
results (list) – List of results satisfying the input Query
query (object / str) – The query used to produce these results. In local mode, the query is an SQLAlchemy object that can be used to redo the query, or extract subsets of results from the query. In remote mode, the query is a literal string representation of the SQL query.
return_type (str) – The MarvinTools object to convert the results into. If initially set, the results are automatically converted into the specified Marvin Tool Object on initialization
objects (list) – The list of Marvin Tools objects created by returntype
count (int) – The number of objects in the returned query results
totalcount (int) – The total number of objects in the full query results
mode ({'auto', 'local', 'remote'}) – The load mode to use. See Mode decision tree.
chunk (int) – For paginated results, the number of results to return. Defaults to 10.
start (int) – For paginated results, the starting index value of the results. Defaults to 0.
end (int) – For paginated results, the ending index value of the results. Defaults to start+chunk.
- Variables
- Returns
results – An object representing the Results entity
Example
>>> f = 'nsa.z < 0.012 and ifu.name = 19*'
>>> q = Query(search_filter=f)
>>> r = q.run()
>>> print(r)
Results(results=[(u'4-3602', u'1902', -9999.0), (u'4-3862', u'1902', -9999.0), (u'4-3293', u'1901', -9999.0), (u'4-3988', u'1901', -9999.0), (u'4-4602', u'1901', -9999.0)],
        query=<sqlalchemy.orm.query.Query object at 0x115217090>,
        count=64,
        mode=local)
- convertToTool(tooltype, mode='auto', limit=None)[source]¶
Converts the list of results into Marvin Tool objects
Creates a list of Marvin Tool objects from a set of query results. The new list is stored in the Results.objects property. If the Query.returntype parameter is specified, then the Results object will automatically convert the results to the desired Tool on initialization.
- Parameters
tooltype (str) – The requested Marvin Tool object that the results are converted into. Overrides the returntype parameter. If not set, defaults to the returntype parameter.
limit (int) – Limit the number of results you convert to Marvin tools. Useful for extremely large result sets. Default is None.
mode (str) – The mode to use when attempting to convert to Tool. Default mode is to use the mode internal to Results. (most often remote mode)
Example
>>> # Get the results from some query
>>> r = q.run()
>>> r.results
[NamedTuple(mangaid=u'14-12', name=u'1901', nsa.z=-9999.0),
 NamedTuple(mangaid=u'14-13', name=u'1902', nsa.z=-9999.0),
 NamedTuple(mangaid=u'27-134', name=u'1901', nsa.z=-9999.0),
 NamedTuple(mangaid=u'27-100', name=u'1902', nsa.z=-9999.0),
 NamedTuple(mangaid=u'27-762', name=u'1901', nsa.z=-9999.0)]

>>> # convert results to Marvin Cube tools
>>> r.convertToTool('cube')
>>> r.objects
[<Marvin Cube (plateifu='7444-1901', mode='remote', data_origin='api')>,
 <Marvin Cube (plateifu='7444-1902', mode='remote', data_origin='api')>,
 <Marvin Cube (plateifu='7995-1901', mode='remote', data_origin='api')>,
 <Marvin Cube (plateifu='7995-1902', mode='remote', data_origin='api')>,
 <Marvin Cube (plateifu='8000-1901', mode='remote', data_origin='api')>]
- download(images=False, limit=None)[source]¶
Download results via sdss_access
Uses sdss_access to download the query results via rsync. Downloads them to the local SAS, i.e. $SAS_BASE_DIR/mangawork/manga/spectro/redux/…
The data type downloaded is indicated by the returntype parameter.
- Parameters
- Returns
NA – na
Example
>>> r = q.run()
>>> r.returntype = 'cube'
>>> r.download()
- extendSet(chunk=None, start=None)[source]¶
Extend the Result set with the next page
Extends the current ResultSet with the next page of results or a specified page. Calls either getNext or getSubset.
- Parameters
- Returns
A new results set
Example
>>> # run a query
>>> r = q.run()
>>> # extend the current result set with the next page
>>> r.extendSet()
See also
getNext, getSubset
- getAll(force=False)[source]¶
Retrieve all of the results of a query
Attempts to return all the results of a query. The efficiency of this method depends heavily on how many rows and columns you wish to return.
A cutoff limit is applied for results with more than 500,000 rows or results with more than 25 columns.
- Parameters
force (bool) – If True, force an attempt to download everything
- Returns
The full list of query results.
See also
getNext, getPrevious, getSubset, loop
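The cutoff rule for getAll can be expressed as a small guard. This is only a sketch: the function name within_cutoff is hypothetical, and the documentation above does not say what happens once the cutoff applies, so the example simply reports whether a full retrieval would be allowed.

```python
MAX_ROWS = 500_000    # row cutoff quoted in the getAll description
MAX_COLUMNS = 25      # column cutoff quoted in the getAll description


def within_cutoff(n_rows, n_columns, force=False):
    """Return True if a full retrieval would be attempted under the stated limits."""
    if force:
        # force=True bypasses the cutoff and attempts to fetch everything.
        return True
    return n_rows <= MAX_ROWS and n_columns <= MAX_COLUMNS


small_query = within_cutoff(1000, 5)
too_many_rows = within_cutoff(600_000, 5)
too_many_columns = within_cutoff(1000, 30)
forced = within_cutoff(600_000, 30, force=True)
```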
- getColumns()[source]¶
Get the columns of the returned results
Returns a ParameterGroup containing the columns from the returned results. Each row of the ParameterGroup is a QueryParameter.
- Returns
columns (list) – A list of column names from the results
Example
>>> r = q.run()
>>> cols = r.getColumns()
>>> print(cols)
[u'mangaid', u'name', u'nsa.z']
- getDictOf(name=None, format_type='listdict', to_json=False, return_all=None)[source]¶
Get a dictionary of specified parameters
- Parameters
name (str) – Name of the parameter name to return. If not specified, it returns all parameters.
format_type ({'listdict', 'dictlist'}) – The format of the results. Listdict is a list of dictionaries. Dictlist is a dictionary of lists. Default is listdict.
to_json (bool) – True/False boolean to convert the output into a JSON format
return_all (bool) – if True, returns the entire result set for that column
- Returns
output (list, dict) – Can be either a list of dictionaries, or a dictionary of lists
Example
>>> # get some results
>>> r = q.run()
>>> # Get a list of dictionaries
>>> r.getDictOf(format_type='listdict')
[{'cube.mangaid': u'4-3988', 'ifu.name': u'1901', 'nsa.z': -9999.0},
 {'cube.mangaid': u'4-3862', 'ifu.name': u'1902', 'nsa.z': -9999.0},
 {'cube.mangaid': u'4-3293', 'ifu.name': u'1901', 'nsa.z': -9999.0},
 {'cube.mangaid': u'4-3602', 'ifu.name': u'1902', 'nsa.z': -9999.0},
 {'cube.mangaid': u'4-4602', 'ifu.name': u'1901', 'nsa.z': -9999.0}]

>>> # Get a dictionary of lists
>>> r.getDictOf(format_type='dictlist')
{'cube.mangaid': [u'4-3988', u'4-3862', u'4-3293', u'4-3602', u'4-4602'],
 'ifu.name': [u'1901', u'1902', u'1901', u'1902', u'1901'],
 'nsa.z': [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0]}

>>> # Get a dictionary of only one parameter
>>> r.getDictOf('mangaid')
[{'cube.mangaid': u'4-3988'},
 {'cube.mangaid': u'4-3862'},
 {'cube.mangaid': u'4-3293'},
 {'cube.mangaid': u'4-3602'},
 {'cube.mangaid': u'4-4602'}]
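The two format_type layouts can be reproduced with plain Python to make the difference concrete. The helper names to_listdict and to_dictlist are hypothetical and work on a bare list of tuples rather than a Results object.

```python
columns = ['cube.mangaid', 'ifu.name', 'nsa.z']
rows = [
    ('4-3988', '1901', -9999.0),
    ('4-3862', '1902', -9999.0),
]


def to_listdict(columns, rows):
    """One dictionary per row, keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


def to_dictlist(columns, rows):
    """One list per column, keyed by column name."""
    return {col: [row[i] for row in rows] for i, col in enumerate(columns)}


listdict = to_listdict(columns, rows)
dictlist = to_dictlist(columns, rows)
```

The listdict form is convenient for row-by-row iteration, while the dictlist form gives direct access to a whole column.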
- getListOf(name=None, to_json=False, to_ndarray=False, return_all=None)[source]¶
Extract a list of a single parameter from results
- Parameters
name (str) – Name of the parameter name to return. If not specified, it returns all parameters.
to_json (bool) – True/False boolean to convert the output into a JSON format
to_ndarray (bool) – True/False boolean to convert the output into a Numpy array
return_all (bool) – if True, returns the entire result set for that column
- Returns
output (list) – A list of results for one parameter
Example
>>> r = q.run()
>>> r.getListOf('mangaid')
[u'4-3988', u'4-3862', u'4-3293', u'4-3602', u'4-4602']
- Raises
AssertionError – Raised when no name is specified.
- getNext(chunk=None)[source]¶
Retrieve the next chunk of results
Returns the next chunk of results from the query, from start to end in units of chunk. Used with getPrevious to paginate through a long list of results.
- Parameters
chunk (int) – The number of objects to return
- Returns
results (list) – A list of query results
Example
>>> r = q.run()
>>> r.getNext(5)
Retrieving next 5, from 35 to 40
[(u'4-4231', u'1902', -9999.0),
 (u'4-14340', u'1901', -9999.0),
 (u'4-14510', u'1902', -9999.0),
 (u'4-13634', u'1901', -9999.0),
 (u'4-13538', u'1902', -9999.0)]
See also
getAll, getPrevious, getSubset
- getPrevious(chunk=None)[source]¶
Retrieve the previous chunk of results.
Returns the previous chunk of results from the query, from start to end in units of chunk. Used with getNext to paginate through a long list of results.
- Parameters
chunk (int) – The number of objects to return
- Returns
results (list) – A list of query results
Example
>>> r = q.run()
>>> r.getPrevious(5)
Retrieving previous 5, from 30 to 35
[(u'4-3988', u'1901', -9999.0),
 (u'4-3862', u'1902', -9999.0),
 (u'4-3293', u'1901', -9999.0),
 (u'4-3602', u'1902', -9999.0),
 (u'4-4602', u'1901', -9999.0)]
See also
getNext, getAll, getSubset
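The start, end, and chunk bookkeeping behind getNext and getPrevious can be sketched with a small in-memory pager. The Pager class and the way it clamps at the ends of the list are assumptions for illustration; the actual methods re-run or page the query rather than slice a local list.

```python
class Pager:
    """Minimal page window over an in-memory list of rows."""

    def __init__(self, rows, chunk=10, start=0):
        self.rows = rows
        self.chunk = chunk
        self.start = start
        self.end = min(start + chunk, len(rows))

    def current(self):
        return self.rows[self.start:self.end]

    def get_next(self, chunk=None):
        # The next page begins where the current one ends.
        chunk = chunk or self.chunk
        self.start = min(self.end, len(self.rows))
        self.end = min(self.start + chunk, len(self.rows))
        return self.current()

    def get_previous(self, chunk=None):
        # The previous page ends where the current one begins.
        chunk = chunk or self.chunk
        self.end = self.start
        self.start = max(self.end - chunk, 0)
        return self.current()


pager = Pager(list(range(100)), chunk=5, start=30)
first_page = pager.current()            # rows 30 to 35
next_page = pager.get_next(5)           # rows 35 to 40
previous_page = pager.get_previous(5)   # rows 30 to 35 again
```

The index ranges mirror the "from 35 to 40" and "from 30 to 35" messages shown in the examples above.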
- getSubset(start, limit=None)[source]¶
Extracts a subset of results
- Parameters
- Returns
results (list) – A list of query results
Example
>>> r = q.run()
>>> r.getSubset(0, 10)
[(u'14-12', u'1901', -9999.0),
 (u'14-13', u'1902', -9999.0),
 (u'27-134', u'1901', -9999.0),
 (u'27-100', u'1902', -9999.0),
 (u'27-762', u'1901', -9999.0),
 (u'27-759', u'1902', -9999.0),
 (u'27-827', u'1901', -9999.0),
 (u'27-828', u'1902', -9999.0),
 (u'27-1170', u'1901', -9999.0),
 (u'27-1167', u'1902', -9999.0)]
See also
getNext, getPrevious, getAll
- hist(name, **kwargs)[source]¶
Make a histogram for a given column of the results
Creates a Matplotlib histogram from a Results column. Accepts as input a string column name. Will extract the entire column (if not already available) and plot it.
See marvin.utils.plot.scatter.hist() for details.
- Parameters
name (str) – The name of the column of data. Required
return_plateifus (bool) – If True, includes the plateifus in each histogram bin in the histogram output. Default is True.
return_figure (bool) – Set to False to not return the Figure and Axis object. Defaults to True.
show_plot (bool) – Set to False to not show the interactive plot
**kwargs (dict) – Any other keyword argument that will be passed to Marvin’s hist plotting methods
- Returns
The histogram data, figure, and axes from the plotting function
Example
>>> # do a query and get the results
>>> q = Query(search_filter='nsa.z < 0.1', return_params=['nsa.elpetro_ba', 'g_r'])
>>> r = q.run()
>>> # plot a histogram of the redshift column
>>> hist_data, fig, axes = r.hist('nsa.z')
- loop(chunk=None)[source]¶
Loop over the full set of results
Starts a loop to collect all the results (in chunks) until the current count reaches the total number of results. Uses extendSet.
- Parameters
chunk (int) – The number of objects to return
Example
>>> # get some results from a query
>>> r = q.run()
>>> # start a loop, grabbing in chunks of 400
>>> r.loop(chunk=400)
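The chunked collection that loop performs can be sketched as follows. The collect_all function and the fetch_page callable are hypothetical; in Marvin the fetching is done through extendSet, which this example replaces with a slice of a local list.

```python
def collect_all(fetch_page, total, chunk):
    """Fetch pages of size chunk until the collected count reaches total."""
    results = []
    n_requests = 0
    while len(results) < total:
        page = fetch_page(len(results), chunk)
        n_requests += 1
        if not page:
            # Stop early if the source runs dry before total is reached.
            break
        results.extend(page)
    return results, n_requests


data = list(range(1000))


def fetch_page(start, chunk):
    # Stand-in for a paginated query: return up to chunk rows from start.
    return data[start:start + chunk]


rows, n_requests = collect_all(fetch_page, total=len(data), chunk=400)
```

With 1000 rows and a chunk of 400, the loop needs three requests (400, 400, and a final 200).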
- merge_tables(tables, direction='vert', **kwargs)[source]¶
Merges a list of Astropy tables of results together
Combines two Astropy tables using either the Astropy vstack or hstack method. vstack refers to vertical stacking of table rows. hstack refers to horizontal stacking of table columns. hstack assumes the rows in each table refer to the same object. Buyer beware: stacking tables without a proper understanding of your rows and columns may produce deleterious results.
merge_tables also accepts all keyword arguments that the Astropy vstack and hstack methods do. See the Astropy documentation for vstack and hstack.
- Parameters
- Returns
A new Astropy table that is the stacked combination of all input tables
Example
>>> # query 1
>>> q, r = doQuery(search_filter='nsa.z < 0.1', return_params=['g_r', 'cube.ra', 'cube.dec'])
>>> # query 2
>>> q2, r2 = doQuery(search_filter='nsa.z < 0.1')
>>>
>>> # convert to tables
>>> table_1 = r.toTable()
>>> table_2 = r2.toTable()
>>> tables = [table_1, table_2]
>>>
>>> # vertical (row) stacking
>>> r.merge_tables(tables, direction='vert')
>>> # horizontal (column) stacking
>>> r.merge_tables(tables, direction='hor')
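To make the row and column stacking distinction concrete without requiring Astropy, the sketch below models each table as a list of row dictionaries. The functions stack_vertical and stack_horizontal are hypothetical stand-ins for vstack and hstack; in particular, this sketch keeps the last value for a repeated key, whereas Astropy has its own handling of repeated column names.

```python
def stack_vertical(tables):
    """Append the rows of every table, one after another."""
    return [dict(row) for table in tables for row in table]


def stack_horizontal(tables):
    """Merge the i-th rows of every table into one wider row."""
    lengths = {len(table) for table in tables}
    if len(lengths) > 1:
        raise ValueError('horizontal stacking needs tables of equal length')
    merged = []
    for rows in zip(*tables):
        combined = {}
        for row in rows:
            combined.update(row)  # assumes the i-th rows describe the same object
        merged.append(combined)
    return merged


table_1 = [{'mangaid': '1-22286', 'z': 0.099954},
           {'mangaid': '1-22301', 'z': 0.105153}]
table_2 = [{'mangaid': '1-22286', 'plate': 7992},
           {'mangaid': '1-22301', 'plate': 7992}]

tall = stack_vertical([table_1, table_2])    # 4 rows
wide = stack_horizontal([table_1, table_2])  # 2 rows, 3 columns each
```

If the two tables were sorted differently, the horizontal result would silently pair unrelated rows, which is the hazard the warning above refers to.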
- plot(x_name, y_name, **kwargs)[source]¶
Make a scatter plot from two columns of results
Creates a Matplotlib scatter plot from Results columns. Accepts as input two string column names. Will extract the entire columns (if not already available) and plot them. Creates a scatter plot with (optionally) adjoining 1-d histograms for each column.
See marvin.utils.plot.scatter.plot() and marvin.utils.plot.scatter.hist() for details.
- Parameters
x_name (str) – The name of the x-column of data. Required
y_name (str) – The name of the y-column of data. Required
return_plateifus (bool) – If True, includes the plateifus in each histogram bin in the histogram output. Default is True.
return_figure (bool) – Set to False to not return the Figure and Axis object. Defaults to True.
show_plot (bool) – Set to False to not show the interactive plot
**kwargs (dict) – Any other keyword argument that will be passed to Marvin’s scatter and hist plotting methods
- Returns
The figure, axes, and histogram data from the plotting function
Example
>>> # do a query and get the results
>>> q = Query(search_filter='nsa.z < 0.1', return_params=['nsa.elpetro_ba', 'g_r'])
>>> r = q.run()
>>> # plot the total columns of Redshift vs g-r magnitude
>>> fig, axes, hist_data = r.plot('nsa.z', 'g_r')
- showQuery()[source]¶
Displays the literal SQL query used to generate the Results objects
- Returns
querystring (str) – A string representation of the SQL query
- sort(name, order='asc')[source]¶
Sort the set of results by column name
Sorts the results (in place) by a given parameter / column name. Sets the results to the new sorted results.
- Parameters
name (str) – The column name to sort on
order ({'asc', 'desc'}) – To sort in ascending or descending order. Default is asc.
Example
>>> r = q.run()
>>> r.getColumns()
[u'mangaid', u'name', u'nsa.z']
>>> r.results
[(u'4-3988', u'1901', -9999.0),
 (u'4-3862', u'1902', -9999.0),
 (u'4-3293', u'1901', -9999.0),
 (u'4-3602', u'1902', -9999.0),
 (u'4-4602', u'1901', -9999.0)]

>>> # Sort the results by mangaid
>>> r.sort('mangaid')
[(u'4-3293', u'1901', -9999.0),
 (u'4-3602', u'1902', -9999.0),
 (u'4-3862', u'1902', -9999.0),
 (u'4-3988', u'1901', -9999.0),
 (u'4-4602', u'1901', -9999.0)]

>>> # Sort the results by IFU name in descending order
>>> r.sort('ifu.name', order='desc')
[(u'4-3602', u'1902', -9999.0),
 (u'4-3862', u'1902', -9999.0),
 (u'4-3293', u'1901', -9999.0),
 (u'4-3988', u'1901', -9999.0),
 (u'4-4602', u'1901', -9999.0)]
- toCSV(filename='myresults.csv', overwrite=False)[source]¶
Output the results as a CSV file
Writes a new CSV file from search results using the astropy Table.write()
- toDataFrame()[source]¶
Output the results as a pandas DataFrame.
Uses the pandas package.
- Parameters
None –
- Returns
dfres – pandas dataframe
Example
>>> r = q.run()
>>> r.toDataFrame()
   mangaid  plate   name     nsa_mstar         z
0  1-22286   7992  12704  1.702470e+11  0.099954
1  1-22301   7992   6101  9.369260e+10  0.105153
2  1-22414   7992   6103  7.489660e+10  0.092272
3  1-22942   7992  12705  8.470360e+10  0.104958
4  1-22948   7992   9102  1.023530e+11  0.119399
- toFits(filename='myresults.fits', overwrite=False)[source]¶
Output the results as a FITS file
Writes a new FITS file from search results using the astropy Table.write()
- toJson(orient: str = 'records', pure: Optional[bool] = None) → str[source]¶
Output the results as a JSON object
Uses the Python pandas package to convert the results to a JSON object. The default orientation is a list of "records". Valid orientations are ('split', 'records', 'index', 'columns', 'values', 'table'). If pandas is not installed or the "pure" option is set, the json package is used instead to convert the results to a JSON representation.
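The pandas-or-json fallback described for toJson might look roughly like the sketch below. The standalone to_json function, its columns and rows arguments, and the restriction of the pure path to the 'records' orientation are assumptions for this example and are not taken from Marvin.

```python
import json

try:
    import pandas as pd
except ImportError:  # pandas is optional in this sketch
    pd = None


def to_json(columns, rows, orient='records', pure=None):
    """Serialise rows to a JSON string, preferring pandas unless pure is set."""
    if pd is not None and not pure:
        return pd.DataFrame(rows, columns=columns).to_json(orient=orient)
    if orient != 'records':
        raise ValueError("the pure-json path in this sketch only supports 'records'")
    # 'records' orientation: a list with one object per row.
    return json.dumps([dict(zip(columns, row)) for row in rows])


columns = ['mangaid', 'name', 'z']
rows = [('1-22286', '12704', 0.099954), ('1-22301', '6101', 0.105153)]
records = json.loads(to_json(columns, rows, pure=True))
```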
- toTable()[source]¶
Output the results as an Astropy Table
Uses the Python Astropy package
- Parameters
None –
- Returns
tableres – Astropy Table
Example
>>> r = q.run()
>>> r.toTable()
<Table length=5>
mangaid    name       nsa.z
unicode6 unicode4    float64
-------- -------- ------------
  4-3602     1902      -9999.0
  4-3862     1902      -9999.0
  4-3293     1901      -9999.0
  4-3988     1901      -9999.0
  4-4602     1901      -9999.0