marvin.tools.query.
Query
(*args, **kwargs)[source]¶Bases: object
A class to perform queries on the MaNGA dataset.
This class is the main way of performing a query. A query works minimally by specifying a list of desired parameters, along with a string filter condition in a natural language SQL format.
A local mode query assumes a local database. A remote mode query uses the API to run the query on the Utah server and returns the results.
By default, the query returns a list of tupled parameters. The parameters are a combination of user-defined parameters, parameters used in the filter condition, and a set of pre-defined default parameters. The object plate-IFU or mangaid is always returned by default.
Parameters: |
|
---|---|
Returns: | results – An instance of the |
Example
>>> # filter of "NSA redshift less than 0.1 and IFU names starting with 19"
>>> searchfilter = 'nsa.z < 0.1 and ifu.name = 19*'
>>> returnparams = ['cube.ra', 'cube.dec']
>>> q = Query(searchfilter=searchfilter, returnparams=returnparams)
>>> results = q.run()
add_condition
()[source]¶Loop over all input forms and add a filter condition based on the input parameter form data.
getPercent
(fxn, **kwargs)[source]¶Query - Computes count comparisons
Retrieves the number of objects that satisfy a given expression in x% of good spaxels. The expression has the form Parameter Operand Value. This function is mapped to the “npergood” filter name.
Syntax: fxnname(expression) operator value
Parameters: | fxn (str) – The function condition used in the query filter |
---|
Example
>>> # Syntax: npergood(expression) operator value
>>> # npergood() is the function name.
>>> #
>>> # Select objects that have Ha flux > 25 in at least
>>> # 20% of their (good) spaxels.
>>> fxn = 'npergood(junk.emline_gflux_ha_6564 > 25) >= 20'
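As a rough illustration of what an npergood condition expresses, the sketch below computes the percentage of good spaxels that satisfy an expression and compares it with a threshold. The helpers `percent_good` and `passes_npergood`, the mask, and the flux values are all hypothetical; this is plain Python on lists, not how Marvin evaluates the filter against the database.

```python
import operator

# Comparison operators allowed in this hypothetical expression and threshold.
OPS = {'>': operator.gt, '>=': operator.ge, '<': operator.lt,
       '<=': operator.le, '==': operator.eq, '!=': operator.ne}


def percent_good(values, good, op, value):
    """Percentage of good spaxels whose value satisfies `v op value`."""
    good_values = [v for v, g in zip(values, good) if g]
    if not good_values:
        return 0.0
    hits = sum(1 for v in good_values if OPS[op](v, value))
    return 100.0 * hits / len(good_values)


def passes_npergood(values, good, op, value, threshold_op, threshold):
    """Mimic `npergood(param op value) threshold_op threshold` for one object."""
    pct = percent_good(values, good, op, value)
    return OPS[threshold_op](pct, threshold)


# Made-up H-alpha fluxes for one galaxy and a mask flagging good spaxels.
ha_flux = [30.0, 10.0, 40.0, 5.0, 50.0]
good = [True, True, True, True, False]

pct = percent_good(ha_flux, good, '>', 25)   # 2 of the 4 good spaxels pass
selected = passes_npergood(ha_flux, good, '>', 25, '>=', 20)
```

Only spaxels flagged as good enter the denominator, which is why the fifth (bad) spaxel does not count even though its flux exceeds 25.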
get_available_params
(paramdisplay='best')[source]¶Retrieve the available parameters to query on
Retrieves a list of the available query parameters. Can either retrieve a list of all the parameters or only the vetted parameters.
Parameters: | paramdisplay (str {all|best}) – String indicating to grab either all or just the vetted parameters. Default is to only return ‘best’, i.e. vetted parameters |
---|---|
Returns: | qparams (list) – a list of all of the available queryable parameters |
restore
(path, delete=False)[source]¶Restore a pickled object
Parameters: | |
---|---|
Returns: | Query (instance) – The instantiated Marvin Query class |
run
(start=None, end=None, raw=None, orm=None, core=None)[source]¶Runs a Marvin Query
Runs the query and returns an instance of the Marvin Results class to deal with the results.
Parameters: | |
---|---|
Returns: | results (object) – An instance of the Marvin Results class containing the results from the Query. |
save
(path=None, overwrite=False)[source]¶Save the query as a pickle object
Parameters: | |
---|---|
Returns: | path (str) – The filepath and name of the pickled object |
set_defaultparams
()[source]¶Loads the default params for a given return type
TODO - change mangaid to plateifu once plateifu works in
cube, maps, rss, modelcube - file objects
spaxel, map, rssfiber - derived objects (no file)
these are also the default params except any query on spaxelprop should return spaxel_index (x/y)
Minimum parameters to instantiate a Marvin Tool
cube - return plateifu/mangaid
modelcube - return plateifu/mangaid, bintype, template
rss - return plateifu/mangaid
maps - return plateifu/mangaid, bintype, template
spaxel - return plateifu/mangaid, spaxel x and y
map - do not instantiate directly (plateifu/mangaid, bintype, template, property name, channel)
rssfiber - do not instantiate directly (plateifu/mangaid, fiberid)
return any of our tools
set_filter
(searchfilter=None)[source]¶Parses a filter string and adds it into the query.
Parses a natural language string filter into the appropriate SQL filter syntax. The string is a boolean join of one or more conditions of the form “PARAMETER_NAME OPERAND VALUE”
Parameter names must be uniquely specified. For example, nsa.z is a unique parameter name in the database and can be specified thusly. On the other hand, name is not a unique parameter name in the database, and must be clarified with the desired table.
Joins: AND | OR | NOT
In the absence of parentheses, the precedence of joins follows: NOT > AND > OR
Operands: == | != | <= | >= | < | > | =
Operand == maps to a strict equality (x == 5 –> x is equal to 5)
Operand = maps to SQL LIKE
(x = 5 –> x contains the string 5; x = ‘%5%’)
(x = 5* –> x starts with the string 5; x = ‘5%’)
(x = *5 –> x ends with the string 5; x = ‘%5’)
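The three LIKE mappings listed above can be sketched as a small translation function. The name `to_like_pattern` is hypothetical and not part of Marvin; the sketch only reproduces the wildcard rules stated here and ignores case folding and escaping.

```python
def to_like_pattern(value):
    """Translate the value of an '=' condition into a SQL LIKE pattern."""
    value = str(value)
    if '*' in value:
        # An explicit wildcard: '*' becomes the SQL '%' wildcard.
        return value.replace('*', '%')
    # No wildcard: match the value anywhere in the string.
    return '%' + value + '%'


contains = to_like_pattern('5')       # x = 5
starts_with = to_like_pattern('5*')   # x = 5*
ends_with = to_like_pattern('*5')     # x = *5
```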
Parameters: | searchfilter (str) – A (natural language) string containing the filter conditions in the query; written as you would say it. |
---|
Example
>>> # Filter string
>>> filter = "nsa.z < 0.012 and ifu.name = 19*"
>>> # Converts to
>>> and_(nsa.z<0.012, ifu.name=19*)
>>> # SQL syntax
>>> mangasampledb.nsa.z < 0.012 AND lower(mangadatadb.ifudesign.name) LIKE lower('19%')
>>> # Filter string
>>> filter = 'cube.plate < 8000 and ifu.name = 19 or not (nsa.z > 0.1 or not cube.ra > 225.)'
>>> # Converts to
>>> or_(and_(cube.plate<8000, ifu.name=19), not_(or_(nsa.z>0.1, not_(cube.ra>225.))))
>>> # SQL syntax
>>> mangadatadb.cube.plate < 8000 AND lower(mangadatadb.ifudesign.name) LIKE lower(('%' || '19' || '%'))
>>> OR NOT (mangasampledb.nsa.z > 0.1 OR mangadatadb.cube.ra <= 225.0)
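Python's own boolean operators bind in the same order (not, then and, then or), so the grouping of the second filter above can be checked with ordinary Python expressions. This is only an illustration: the functions and test values are hypothetical, and `'19' in name` stands in for the LIKE "contains" match.

```python
from itertools import product


def ungrouped(plate, name, z, ra):
    # Written without extra parentheses, mirroring the filter string.
    return plate < 8000 and '19' in name or not (z > 0.1 or not ra > 225.)


def grouped(plate, name, z, ra):
    # Fully parenthesized according to NOT > AND > OR.
    return (((plate < 8000) and ('19' in name))
            or (not ((z > 0.1) or (not (ra > 225.)))))


# Try every combination of a few made-up values for each parameter.
cases = list(product([7443, 8485], ['1901', '12701'], [0.05, 0.2], [200., 230.]))
agree = all(ungrouped(*case) == grouped(*case) for case in cases)

# plate >= 8000 fails the first branch, but z <= 0.1 and ra > 225 pass the second.
example = ungrouped(8485, '1901', 0.05, 230.)
```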
set_returnparams
(returnparams)[source]¶Loads the user input parameters into the query parameters
Adds a list of string parameter names into the main list of query parameters to return
Parameters: | returnparams (list) – A string list of the parameters you wish to return in the query |
---|
show
(prop=None)[source]¶Prints info to the console
Displays the query to the console with parameter variables plugged in. Input prop can be one of query, tables, joins, or filter.
Only works in LOCAL mode.
Parameters: | prop (str) – The type of info to print. |
---|
Example
TODO add example
marvin.utils.datamodel.query.base.
ParameterGroup
(name, items, parent=None)[source]¶Bases: marvin.utils.datamodel.query.base.QueryFuzzyList
A Query Parameter Group Object
Query parameters are grouped into specific categories for ease of use and navigation. This object subclasses from the Python list object.
Parameters: |
|
---|
list_params
(name_type=None, subset=None)[source]¶List the parameter names for a given group
Lists the Query Parameters of the given group
Parameters: |
|
---|---|
Returns: | param (list) – The list of parameters |
display
full
parameters
remote
short
marvin.utils.datamodel.query.base.
ParameterGroupList
(items)[source]¶Bases: marvin.utils.datamodel.query.base.QueryFuzzyList
ParameterGroup Object
This object inherits from the Python list object. This represents a list of query ParameterGroups.
list_groups
()[source]¶Returns a list of query groups.
Returns: | names (list) – A string list of all the Query Group names |
---|
list_params
(name_type='full', groups=None)[source]¶Returns a list of parameters from all groups.
Return a string list of the full parameter names. Default is all parameters across all groups.
Parameters: |
|
---|---|
Returns: | params (list) – A list of full parameter names |
best
¶List the best parameters in each group
names
¶List all the parameter groups
parameters
¶List all the queryable parameters
marvin.utils.datamodel.query.base.
QueryDataModel
(release, groups=[], aliases=[], exclude=[], **kwargs)[source]¶Bases: object
A class representing a Query datamodel
to_table
(pprint=False, max_width=1000, only_best=False, db=False)[source]¶Write the datamodel to an Astropy table
best
best_groups
groups
¶Returns the groups for this datamodel.
parameters
marvin.utils.datamodel.query.base.
QueryDataModelList
(models=None)[source]¶Bases: marvin.utils.datamodel.DataModelList
A dictionary of Query datamodels.
base_model
¶alias of QueryDataModel
base
= {'QueryDataModel': <class 'marvin.utils.datamodel.query.base.QueryDataModel'>}
base_name
= 'QueryDataModel'
marvin.utils.datamodel.query.base.
QueryFuzzyList
(the_list, use_fuzzy=None)[source]¶Bases: marvin.utils.general.structs.FuzzyList
Fuzzy List for Query Parameters
marvin.utils.datamodel.query.base.
QueryList
(items)[source]¶Bases: marvin.utils.datamodel.query.base.QueryFuzzyList
A class for a list of Query Parameters
marvin.utils.datamodel.query.base.
QueryParameter
(full, table=None, name=None, short=None, display=None, remote=None, dtype=None, **kwargs)[source]¶Bases: object
A Query Parameter class
An object representing a query parameter. Provides access to different names for a given parameter.
Parameters: |
|
---|---|
Variables: | property – The DAP Datamodel Property corresponding to this Query Parameter |
db_column
db_schema
db_table
marvin.utils.datamodel.query.base.
get_best_fuzzy
(name, choices, cutoff=60, return_score=False)[source]¶
marvin.utils.datamodel.query.base.
strip_mapped
(self)[source]¶Strip the mapped items for display with dir
Since __dir__ cannot have . in the attribute name, this strips the returned mapper(item) parameter of any . in the name. Used for query parameter syntax [table.parameter_name]
For cases where the parameter_name is “name”, and thus non-unique, it also returns the mapper name with “.” replaced with “_”, to make unique. “ifu.name” becomes “ifu_name”, etc.
Parameters: | self – a QueryFuzzyList object |
---|---|
Returns: | list of mapped names stripped of dots |
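A minimal sketch of the stripping rule described above, assuming the input is simply a list of full parameter names. The function `strip_names` is hypothetical and works on plain strings, whereas the real function operates on a QueryFuzzyList and its mapper.

```python
def strip_names(full_names):
    """Return dot-free names suitable for attribute-style listing."""
    stripped = []
    for full in full_names:
        table, _, param = full.rpartition('.')
        if param == 'name' and table:
            # 'name' alone is not unique, so keep the table as a prefix.
            stripped.append(full.replace('.', '_'))
        else:
            # Otherwise keep only the parameter name after the last dot.
            stripped.append(param)
    return stripped


names = strip_names(['nsa.z', 'ifu.name', 'cube.ra', 'g_r'])
```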
marvin.tools.results.
Results
(*args, **kwargs)[source]¶Bases: object
A class to handle results from queries on the MaNGA dataset
Parameters: |
|
---|---|
Variables: | |
Returns: | results – An object representing the Results entity |
Example
>>> f = 'nsa.z < 0.012 and ifu.name = 19*'
>>> q = Query(searchfilter=f)
>>> r = q.run()
>>> print(r)
>>> Results(results=[(u'4-3602', u'1902', -9999.0), (u'4-3862', u'1902', -9999.0), (u'4-3293', u'1901', -9999.0), (u'4-3988', u'1901', -9999.0), (u'4-4602', u'1901', -9999.0)],
>>> query=<sqlalchemy.orm.query.Query object at 0x115217090>,
>>> count=64,
>>> mode=local)
convertToTool
(tooltype, **kwargs)[source]¶Converts the list of results into Marvin Tool objects
Creates a list of Marvin Tool objects from a set of query results. The new list is stored in the Results.objects property. If the Query.returntype parameter is specified, then the Results object will automatically convert the results to the desired Tool on initialization.
Parameters: |
|
---|
Example
>>> # Get the results from some query
>>> r = q.run()
>>> r.results
>>> [NamedTuple(mangaid=u'14-12', name=u'1901', nsa.z=-9999.0),
>>> NamedTuple(mangaid=u'14-13', name=u'1902', nsa.z=-9999.0),
>>> NamedTuple(mangaid=u'27-134', name=u'1901', nsa.z=-9999.0),
>>> NamedTuple(mangaid=u'27-100', name=u'1902', nsa.z=-9999.0),
>>> NamedTuple(mangaid=u'27-762', name=u'1901', nsa.z=-9999.0)]
>>> # convert results to Marvin Cube tools
>>> r.convertToTool('cube')
>>> r.objects
>>> [<Marvin Cube (plateifu='7444-1901', mode='remote', data_origin='api')>,
>>> <Marvin Cube (plateifu='7444-1902', mode='remote', data_origin='api')>,
>>> <Marvin Cube (plateifu='7995-1901', mode='remote', data_origin='api')>,
>>> <Marvin Cube (plateifu='7995-1902', mode='remote', data_origin='api')>,
>>> <Marvin Cube (plateifu='8000-1901', mode='remote', data_origin='api')>]
download
(images=False, limit=None)[source]¶Download results via sdss_access
Uses sdss_access to download the query results via rsync. Downloads them to the local SAS, i.e. $SAS_BASE_DIR/mangawork/manga/spectro/redux/…
The data type downloaded is indicated by the returntype parameter.
Parameters: | |
---|---|
Returns: | NA – na |
Example
>>> r = q.run()
>>> r.returntype = 'cube'
>>> r.download()
extendSet
(chunk=None, start=None)[source]¶Extend the Result set with the next page
Extends the current ResultSet with the next page of results or a specified page. Calls either getNext or getSubset.
Parameters: | |
---|---|
Returns: | A new results set |
Example
>>> # run a query
>>> r = q.run()
>>> # extend the current result set with the next page
>>> r.extendSet()
>>>
See also
getNext, getSubset
getAll
(force=False)[source]¶Retrieve all of the results of a query
Attempts to return all the results of a query. The efficiency of this method depends heavily on how many rows and columns you wish to return.
A cutoff limit is applied for results with more than 500,000 rows or results with more than 25 columns.
Returns: | The full list of query results. |
---|
See also
getNext, getPrevious, getSubset, loop
getColumns
()[source]¶Get the columns of the returned results
Returns a ParameterGroup containing the columns from the returned results. Each row of the ParameterGroup is a QueryParameter.
Returns: | columns (list) – A list of column names from the results |
---|
Example
>>> r = q.run()
>>> cols = r.getColumns()
>>> print(cols)
>>> [u'mangaid', u'name', u'nsa.z']
getDictOf
(name=None, format_type='listdict', to_json=False, return_all=None)[source]¶Get a dictionary of specified parameters
Parameters: |
|
---|---|
Returns: | output (list, dict) – Can be either a list of dictionaries, or a dictionary of lists |
Example
>>> # get some results
>>> r = q.run()
>>> # Get a list of dictionaries
>>> r.getDictOf(format_type='listdict')
>>> [{'cube.mangaid': u'4-3988', 'ifu.name': u'1901', 'nsa.z': -9999.0},
>>> {'cube.mangaid': u'4-3862', 'ifu.name': u'1902', 'nsa.z': -9999.0},
>>> {'cube.mangaid': u'4-3293', 'ifu.name': u'1901', 'nsa.z': -9999.0},
>>> {'cube.mangaid': u'4-3602', 'ifu.name': u'1902', 'nsa.z': -9999.0},
>>> {'cube.mangaid': u'4-4602', 'ifu.name': u'1901', 'nsa.z': -9999.0}]
>>> # Get a dictionary of lists
>>> r.getDictOf(format_type='dictlist')
>>> {'cube.mangaid': [u'4-3988', u'4-3862', u'4-3293', u'4-3602', u'4-4602'],
>>> 'ifu.name': [u'1901', u'1902', u'1901', u'1902', u'1901'],
>>> 'nsa.z': [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0]}
>>> # Get a dictionary of only one parameter
>>> r.getDictOf('mangaid')
>>> [{'cube.mangaid': u'4-3988'},
>>> {'cube.mangaid': u'4-3862'},
>>> {'cube.mangaid': u'4-3293'},
>>> {'cube.mangaid': u'4-3602'},
>>> {'cube.mangaid': u'4-4602'}]
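The two output shapes shown above (a list of dictionaries and a dictionary of lists) can be sketched in plain Python. The helpers `to_listdict` and `to_dictlist` are hypothetical names that echo the format_type values; they are not Marvin functions, and the rows are copied from the example output for illustration only.

```python
columns = ['cube.mangaid', 'ifu.name', 'nsa.z']
rows = [('4-3988', '1901', -9999.0),
        ('4-3862', '1902', -9999.0)]


def to_listdict(columns, rows):
    """One dictionary per row, keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


def to_dictlist(columns, rows):
    """One list per column, keyed by column name."""
    return {col: [row[i] for row in rows] for i, col in enumerate(columns)}


listdict = to_listdict(columns, rows)
dictlist = to_dictlist(columns, rows)
```

The list-of-dictionaries form is convenient for iterating row by row, while the dictionary-of-lists form gives direct access to a whole column.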
getListOf
(name=None, to_json=False, to_ndarray=False, return_all=None)[source]¶Extract a list of a single parameter from results
Parameters: |
|
---|---|
Returns: | output (list) – A list of results for one parameter |
Example
>>> r = q.run()
>>> r.getListOf('mangaid')
>>> [u'4-3988', u'4-3862', u'4-3293', u'4-3602', u'4-4602']
Raises: | AssertionError – Raised when no name is specified. |
---|
getNext
(chunk=None)[source]¶Retrieve the next chunk of results
Returns the next chunk of results from the query, from start to end in units of chunk. Used with getPrevious to paginate through a long list of results.
Parameters: | chunk (int) – The number of objects to return |
---|---|
Returns: | results (list) – A list of query results |
Example
>>> r = q.run()
>>> r.getNext(5)
>>> Retrieving next 5, from 35 to 40
>>> [(u'4-4231', u'1902', -9999.0),
>>> (u'4-14340', u'1901', -9999.0),
>>> (u'4-14510', u'1902', -9999.0),
>>> (u'4-13634', u'1901', -9999.0),
>>> (u'4-13538', u'1902', -9999.0)]
See also
getAll, getPrevious, getSubset
getPrevious
(chunk=None)[source]¶Retrieve the previous chunk of results.
Returns the previous chunk of results from the query, from start to end in units of chunk. Used with getNext to paginate through a long list of results.
Parameters: | chunk (int) – The number of objects to return |
---|---|
Returns: | results (list) – A list of query results |
Example
>>> r = q.run()
>>> r.getPrevious(5)
>>> Retrieving previous 5, from 30 to 35
>>> [(u'4-3988', u'1901', -9999.0),
>>> (u'4-3862', u'1902', -9999.0),
>>> (u'4-3293', u'1901', -9999.0),
>>> (u'4-3602', u'1902', -9999.0),
>>> (u'4-4602', u'1901', -9999.0)]
See also
getNext, getAll, getSubset
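The paging behaviour of getNext and getPrevious can be pictured as a start/end window sliding over the full result list. The `Pager` class below is a hypothetical stand-in written for illustration; it holds all rows in memory and makes no claim about how Marvin tracks or fetches pages, in particular in remote mode.

```python
class Pager:
    """Hypothetical start/end window over a full list of rows."""

    def __init__(self, rows, chunk=5):
        self.rows = list(rows)
        self.chunk = chunk
        self.start = 0
        self.end = min(chunk, len(self.rows))

    def current(self):
        return self.rows[self.start:self.end]

    def get_next(self, chunk=None):
        chunk = chunk or self.chunk
        if self.end < len(self.rows):
            # Slide the window forward, clipping at the last row.
            self.start = self.end
            self.end = min(self.start + chunk, len(self.rows))
        return self.current()

    def get_previous(self, chunk=None):
        chunk = chunk or self.chunk
        if self.start > 0:
            # Slide the window backward, clipping at the first row.
            self.end = self.start
            self.start = max(self.end - chunk, 0)
        return self.current()


pager = Pager(range(12), chunk=5)
first = pager.current()
second = pager.get_next()
third = pager.get_next()
back = pager.get_previous()
```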
getSubset
(start, limit=None)[source]¶Extracts a subset of results
Parameters: | |
---|---|
Returns: | results (list) – A list of query results |
Example
>>> r = q.run()
>>> r.getSubset(0, 10)
>>> [(u'14-12', u'1901', -9999.0),
>>> (u'14-13', u'1902', -9999.0),
>>> (u'27-134', u'1901', -9999.0),
>>> (u'27-100', u'1902', -9999.0),
>>> (u'27-762', u'1901', -9999.0),
>>> (u'27-759', u'1902', -9999.0),
>>> (u'27-827', u'1901', -9999.0),
>>> (u'27-828', u'1902', -9999.0),
>>> (u'27-1170', u'1901', -9999.0),
>>> (u'27-1167', u'1902', -9999.0)]
See also
getNext, getPrevious, getAll
hist
(name, **kwargs)[source]¶Make a histogram for a given column of the results
Creates a Matplotlib histogram from a Results column. Accepts as input a string column name. Will extract the entire column (if not already available) and plot it.
See marvin.utils.plot.scatter.hist() for details.
Parameters: |
|
---|---|
Returns: | The histogram data, figure, and axes from the plotting function |
Example
>>> # do a query and get the results
>>> q = Query(searchfilter='nsa.z < 0.1', returnparams=['nsa.elpetro_ba', 'g_r'])
>>> r = q.run()
>>> # plot a histogram of the redshift column
>>> hist_data, fig, axes = r.hist('nsa.z')
loop
(chunk=None)[source]¶Loop over the full set of results
Starts a loop to collect all the results (in chunks) until the current count reaches the total number of results. Uses extendSet.
Parameters: | chunk (int) – The number of objects to return |
---|
Example
>>> # get some results from a query
>>> r = q.run()
>>> # start a loop, grabbing in chunks of 400
>>> r.loop(chunk=400)
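The loop described above amounts to repeatedly fetching the next chunk and extending the collected set until its length reaches the total count. The sketch below assumes a hypothetical `fetch_chunk` function that slices an in-memory list; in Marvin the chunks come from the query itself.

```python
def fetch_chunk(source, start, chunk):
    """Hypothetical page fetch: return up to `chunk` rows beginning at `start`."""
    return source[start:start + chunk]


def loop(source, chunk=400):
    """Collect every row in chunks until the count reaches the total."""
    total = len(source)
    collected = []
    calls = 0
    while len(collected) < total:
        page = fetch_chunk(source, len(collected), chunk)
        if not page:
            # Guard against an endless loop if a fetch returns nothing.
            break
        collected.extend(page)
        calls += 1
    return collected, calls


all_rows, n_calls = loop(list(range(1000)), chunk=400)
```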
merge_tables
(tables, direction='vert', **kwargs)[source]¶Merges a list of Astropy tables of results together
Combines Astropy tables using either the Astropy vstack or hstack method. vstack refers to vertical stacking of table rows. hstack refers to horizontal stacking of table columns. hstack assumes the rows in each table refer to the same object. Buyer beware: stacking tables without a proper understanding of your rows and columns may produce deleterious results.
merge_tables also accepts all keyword arguments that the Astropy vstack and hstack methods do. See the Astropy documentation for vstack and hstack.
Parameters: | |
---|---|
Returns: | A new Astropy table that is the stacked combination of all input tables |
Example
>>> # query 1
>>> q, r = doQuery(searchfilter='nsa.z < 0.1', returnparams=['g_r', 'cube.ra', 'cube.dec'])
>>> # query 2
>>> q2, r2 = doQuery(searchfilter='nsa.z < 0.1')
>>>
>>> # convert to tables
>>> table_1 = r.toTable()
>>> table_2 = r2.toTable()
>>> tables = [table_1, table_2]
>>>
>>> # vertical (row) stacking
>>> r.merge_tables(tables, direction='vert')
>>> # horizontal (column) stacking
>>> r.merge_tables(tables, direction='hor')
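To make the difference between the two directions concrete, here is a plain-Python sketch in which each table is a list of row dictionaries. The functions `vstack` and `hstack` below are simplified stand-ins that only share their names with the Astropy functions; the Astropy versions additionally handle mismatched columns, masking, and duplicate column names, none of which is modelled here.

```python
def vstack(tables):
    """Stack rows: the output holds the rows of every input table, in order."""
    return [dict(row) for table in tables for row in table]


def hstack(tables):
    """Stack columns: row i of the output merges row i of every input table."""
    if len({len(table) for table in tables}) > 1:
        raise ValueError('this sketch requires tables of equal length')
    merged = []
    for rows in zip(*tables):
        out = {}
        for row in rows:
            out.update(row)
        merged.append(out)
    return merged


# Two made-up tables whose rows refer to the same two objects.
table_1 = [{'mangaid': '1-1', 'ra': 1.0}, {'mangaid': '1-2', 'ra': 2.0}]
table_2 = [{'z': 0.01}, {'z': 0.02}]

tall = vstack([table_1, table_1])
wide = hstack([table_1, table_2])
```

The horizontal case silently pairs rows by position, which is why it only makes sense when row i in every table describes the same object.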
plot
(x_name, y_name, **kwargs)[source]¶Make a scatter plot from two columns of results
Creates a Matplotlib scatter plot from Results columns. Accepts as input two string column names. Will extract the entire columns (if not already available) and plot them. Creates a scatter plot with (optionally) adjoining 1-d histograms for each column.
See marvin.utils.plot.scatter.plot() and marvin.utils.plot.scatter.hist() for details.
Parameters: |
|
---|---|
Returns: | The figure, axes, and histogram data from the plotting function |
Example
>>> # do a query and get the results
>>> q = Query(searchfilter='nsa.z < 0.1', returnparams=['nsa.elpetro_ba', 'g_r'])
>>> r = q.run()
>>> # plot the total columns of Redshift vs g-r magnitude
>>> fig, axes, hist_data = r.plot('nsa.z', 'g_r')
restore
(path, delete=False)[source]¶Restore a pickled Results object
Parameters: | |
---|---|
Returns: | Results (instance) – The instantiated Marvin Results class |
save
(path=None, overwrite=False)[source]¶Save the results as a pickle object
Parameters: | |
---|---|
Returns: | path (str) – The filepath and name of the pickled object |
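The save and restore pair is a pickle round trip. The sketch below shows that pattern with the standard library on an ordinary dictionary. The functions are hypothetical, and the meaning given to `overwrite` and `delete` (refuse to clobber an existing file; remove the file after loading) is an assumption based on the parameter names, since their descriptions are not given here.

```python
import os
import pickle
import tempfile


def save(obj, path, overwrite=False):
    """Pickle `obj` to `path`; assumed to refuse existing files unless overwrite."""
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(path)
    with open(path, 'wb') as handle:
        pickle.dump(obj, handle)
    return path


def restore(path, delete=False):
    """Load a pickled object; assumed to remove the file when delete is True."""
    with open(path, 'rb') as handle:
        obj = pickle.load(handle)
    if delete:
        os.remove(path)
    return obj


# Stand-in for a results object; the file name is arbitrary.
results = {'columns': ['mangaid', 'name'], 'rows': [('4-3988', '1901')]}

with tempfile.TemporaryDirectory() as tmpdir:
    saved_path = save(results, os.path.join(tmpdir, 'results.pkl'))
    restored = restore(saved_path, delete=True)
    still_there = os.path.exists(saved_path)
```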
showQuery
()[source]¶Displays the literal SQL query used to generate the Results objects
Returns: | querystring (str) – A string representation of the SQL query |
---|
sort
(name, order='asc')[source]¶Sort the set of results by column name
Sorts the results (in place) by a given parameter / column name. Sets the results to the new sorted results.
Parameters: |
|
---|
Example
>>> r = q.run()
>>> r.getColumns()
>>> [u'mangaid', u'name', u'nsa.z']
>>> r.results
>>> [(u'4-3988', u'1901', -9999.0),
>>> (u'4-3862', u'1902', -9999.0),
>>> (u'4-3293', u'1901', -9999.0),
>>> (u'4-3602', u'1902', -9999.0),
>>> (u'4-4602', u'1901', -9999.0)]
>>> # Sort the results by mangaid
>>> r.sort('mangaid')
>>> [(u'4-3293', u'1901', -9999.0),
>>> (u'4-3602', u'1902', -9999.0),
>>> (u'4-3862', u'1902', -9999.0),
>>> (u'4-3988', u'1901', -9999.0),
>>> (u'4-4602', u'1901', -9999.0)]
>>> # Sort the results by IFU name in descending order
>>> r.sort('ifu.name', order='desc')
>>> [(u'4-3602', u'1902', -9999.0),
>>> (u'4-3862', u'1902', -9999.0),
>>> (u'4-3293', u'1901', -9999.0),
>>> (u'4-3988', u'1901', -9999.0),
>>> (u'4-4602', u'1901', -9999.0)]
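Sorting by column name boils down to looking up the column's position and using it as the sort key. A minimal sketch with plain tuples follows; `sort_rows` is a hypothetical helper that returns a new list, unlike the in-place method documented above, and it takes the bare column name rather than a table-qualified one.

```python
columns = ['mangaid', 'name', 'nsa.z']
rows = [('4-3988', '1901', -9999.0),
        ('4-3862', '1902', -9999.0),
        ('4-3293', '1901', -9999.0),
        ('4-3602', '1902', -9999.0),
        ('4-4602', '1901', -9999.0)]


def sort_rows(rows, columns, name, order='asc'):
    """Return the rows sorted by the named column."""
    index = columns.index(name)
    return sorted(rows, key=lambda row: row[index], reverse=(order == 'desc'))


by_mangaid = sort_rows(rows, columns, 'mangaid')
by_name_desc = sort_rows(rows, columns, 'name', order='desc')
```

Note that mangaid values are strings, so they sort lexicographically rather than numerically.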
toCSV
(filename='myresults.csv', overwrite=False)[source]¶Output the results as a CSV file
Writes a new CSV file from search results using the astropy Table.write()
Parameters: |
---|
toDataFrame
()[source]¶Output the results as a pandas DataFrame.
Uses the pandas package.
Parameters: | None – |
---|---|
Returns: | dfres – pandas dataframe |
Example
>>> r = q.run()
>>> r.toDataFrame()
mangaid plate name nsa_mstar z
0 1-22286 7992 12704 1.702470e+11 0.099954
1 1-22301 7992 6101 9.369260e+10 0.105153
2 1-22414 7992 6103 7.489660e+10 0.092272
3 1-22942 7992 12705 8.470360e+10 0.104958
4 1-22948 7992 9102 1.023530e+11 0.119399
toFits
(filename='myresults.fits', overwrite=False)[source]¶Output the results as a FITS file
Writes a new FITS file from search results using the astropy Table.write()
Parameters: |
---|
toJson
()[source]¶Output the results as a JSON object
Uses Python json package to convert the results to JSON representation
Parameters: | None – |
---|---|
Returns: | jsonres – JSONed results |
Example
>>> r = q.run()
>>> r.toJson()
>>> '[["4-3602", "1902", -9999.0], ["4-3862", "1902", -9999.0], ["4-3293", "1901", -9999.0],
>>> ["4-3988", "1901", -9999.0], ["4-4602", "1901", -9999.0]]'
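For reference, the standard json package serializes a list of tuples as a JSON array of arrays, which matches the shape of the output above. This sketch uses made-up rows and only illustrates the json round trip; it does not reproduce any extra processing Marvin may apply.

```python
import json

rows = [('4-3602', '1902', -9999.0),
        ('4-3862', '1902', -9999.0)]

# Tuples are written as JSON arrays.
jsonres = json.dumps(rows)

# Reading the string back yields lists, not tuples.
roundtrip = json.loads(jsonres)
```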
toTable
()[source]¶Output the results as an Astropy Table
Uses the Python Astropy package
Parameters: | None – |
---|---|
Returns: | tableres – Astropy Table |
Example
>>> r = q.run()
>>> r.toTable()
>>> <Table length=5>
>>> mangaid name nsa.z
>>> unicode6 unicode4 float64
>>> -------- -------- ------------
>>> 4-3602 1902 -9999.0
>>> 4-3862 1902 -9999.0
>>> 4-3293 1901 -9999.0
>>> 4-3988 1901 -9999.0
>>> 4-4602 1901 -9999.0
marvin.tools.results.
ResultSet
(_objects, **kwargs)[source]¶Bases: list
A Set of Results
A list object representing a set of query results. Each row of the list is a ResultRow object, which is a custom Marvin namedtuple object. ResultSets can be extended column-wise or row-wise by adding them together.
Parameters: |
|
---|
sort
(name=None, reverse=False)[source]¶Sort the results
In-place sorting of the result set. This is the standard list sorting mechanism. When no name is specified, does standard list sorting with no key.
Parameters: | |
---|---|
Returns: | A sorted list |
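A rough picture of a list of namedtuple rows with name-based sorting is sketched below. `Row` and `RowSet` are hypothetical stand-ins for ResultRow and ResultSet; only row-wise combination is shown, and the real classes carry more behaviour than this.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical row type; field names must be valid Python identifiers.
Row = namedtuple('Row', ['mangaid', 'name', 'z'])


class RowSet(list):
    """A list of rows that can be sorted in place by field name."""

    def sort(self, name=None, reverse=False):
        if name is None:
            # Standard list sorting with no key.
            super().sort(reverse=reverse)
        else:
            super().sort(key=attrgetter(name), reverse=reverse)
        return self


a = RowSet([Row('4-3988', '1901', 0.03), Row('4-3293', '1902', 0.01)])
b = RowSet([Row('4-3602', '1901', 0.02)])

# Row-wise extension: concatenate the two sets of rows.
combined = RowSet(a + b)
combined.sort(name='z')
by_z = [row.mangaid for row in combined]
```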