castep_outputs.utilities.utility

Utility functions for parsing castep outputs.

Functions

add_aliases(in_dict, alias_dict, *[, ...])

Add aliases of known names into dictionary.

atreg_to_index(dict_in, *[, clear])

Transform a matched atreg value to species index tuple.

determine_type(data)

Determine the datatype and return the appropriate type.

file_or_path(*, mode, **open_kwargs)

Decorate to allow a parser to accept either a path or open file.

fix_data_types(in_dict, type_dict)

Apply correct types to elements of in_dict by mapping given in type_dict.

flatten_dict(dictionary[, parent_key, separator])

Turn a nested dictionary into a flattened dictionary.

get_only(seq)

Get the only element of a Sequence ensuring uniqueness.

json_safe(obj)

Recursively transform datatypes into JSON safe variants.

log_factory(file)

Return logging function to add file info to logs.

normalise(obj, mapping)

Standardise data after processing.

normalise_key(string)

Normalise a dictionary key.

normalise_string(string)

Normalise a string.

parse_int_or_float(numbers)

Parse numbers to int if all elements are ints, or to float otherwise.

stack_dict(out_dict, in_dict)

Append items in in_dict to the keys in out_dict.

strip_comments(data, *[, comment_char, ...])

Strip comments from data.

to_type(data_in, _typ)

Convert data_in to typ, whether data_in is iterable or a single value.

castep_outputs.utilities.utility.add_aliases(in_dict, alias_dict, *, replace=False, inplace=True)

Add aliases of known names into dictionary.

If replace is True, this will remove the original.

Parameters:
  • in_dict (dict[str, Any]) – Dictionary of data to alias.

  • alias_dict (dict[str, str]) – Mapping of from->to for keys in in_dict.

  • replace (bool) – Whether to remove the from key from in_dict.

  • inplace (bool) – Whether to modify in_dict in place rather than return a modified copy.

Returns:

in_dict with keys substituted.

Return type:

dict[str, Any]

Examples

>>> add_aliases({'hi': 1, 'bye': 2}, {'hi': 'frog'})
{'hi': 1, 'bye': 2, 'frog': 1}
>>> add_aliases({'hi': 1, 'bye': 2}, {'hi': 'frog'}, replace=True)
{'bye': 2, 'frog': 1}
castep_outputs.utilities.utility.atreg_to_index(dict_in, *, clear=True)

Transform a matched atreg value to species index tuple.

Optionally clear value from dictionary for easier processing.

Parameters:
  • dict_in (dict[str, str] | Match) – Atreg to process.

  • clear (bool) – Whether to remove from incoming dictionary.

Return type:

tuple[str, int]

Returns:

  • species (str) – Atomic species.

  • ind (int) – Internal index.

Examples

>>> parsed_line = {'x': 3.1, 'y': 2.1, 'z': 1.0, 'spec': 'Ar', 'index': '1'}
>>> atreg_to_index(parsed_line, clear=False)
('Ar', 1)
>>> parsed_line
{'x': 3.1, 'y': 2.1, 'z': 1.0, 'spec': 'Ar', 'index': '1'}
>>> atreg_to_index(parsed_line)
('Ar', 1)
>>> parsed_line
{'x': 3.1, 'y': 2.1, 'z': 1.0}
castep_outputs.utilities.utility.determine_type(data)

Determine the datatype and return the appropriate type.

For dealing with miscellaneous data read from input files.

Parameters:

data (str) – String to process.

Returns:

Best type to attempt.

Return type:

type

Examples

>>> determine_type('T')
<class 'bool'>
>>> determine_type('False')
<class 'bool'>
>>> determine_type('3.1415')
<class 'float'>
>>> determine_type('123')
<class 'int'>
>>> determine_type('1/3')
<class 'float'>
>>> determine_type('BEEF')
<class 'str'>
castep_outputs.utilities.utility.file_or_path(*, mode, **open_kwargs)

Decorate to allow a parser to accept either a path or open file.

Parameters:

mode (Literal["r", "rb"]) – Open mode if passed a Path or str.

Return type:

Callable
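
The decorator pattern described above can be sketched as follows. This is a minimal illustration, not the library's implementation: file_or_path_sketch and count_lines are hypothetical names, and the rule "open str and Path inputs, pass anything else through unchanged" is an assumption.

```python
import functools
import io
from pathlib import Path


def file_or_path_sketch(*, mode, **open_kwargs):
    """Hypothetical stand-in showing the path-or-file decorator pattern."""
    def decorator(parser):
        @functools.wraps(parser)
        def wrapper(file, *args, **kwargs):
            # Assumption: str and Path inputs are opened here with `mode`;
            # anything else is treated as an already-open file object.
            if isinstance(file, (str, Path)):
                with open(file, mode, **open_kwargs) as handle:
                    return parser(handle, *args, **kwargs)
            return parser(file, *args, **kwargs)
        return wrapper
    return decorator


@file_or_path_sketch(mode="r", encoding="utf-8")
def count_lines(handle):
    """Toy parser: count the lines in an open text file."""
    return sum(1 for _ in handle)


# An open file-like object is passed straight to the parser.
n_lines = count_lines(io.StringIO("first\nsecond\n"))
```

With this sketch, count_lines accepts either an open text file or a path to one, and the file opened from a path is closed once the parser returns.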

castep_outputs.utilities.utility.fix_data_types(in_dict, type_dict)

Apply correct types to elements of in_dict by mapping given in type_dict.

Parameters:
  • in_dict (MutableMapping[str, Any]) – Dictionary of {key: values} to convert.

  • type_dict (dict[str, type]) – Mapping of keys to types the keys should be converted to.

Return type:

None

See also

to_type

Conversion function.

Notes

Modifies the dictionary in-place.

Examples

>>> my_dict = {"int": "7", "float": "3.141", "bool": "T",
...            "vector": ["3", "4", "5"], "blank": "Hello"}
>>> type_map = {"int": int, "float": float, "bool": bool, "vector": float}
>>> fix_data_types(my_dict, type_map)
>>> print(my_dict)
{'int': 7, 'float': 3.141, 'bool': True, 'vector': (3.0, 4.0, 5.0), 'blank': 'Hello'}
castep_outputs.utilities.utility.flatten_dict(dictionary, parent_key='', separator='_')

Turn a nested dictionary into a flattened dictionary.

Parameters:
  • dictionary (MutableMapping[Any, Any]) – The dictionary to flatten.

  • parent_key (str) – The string to prepend to dictionary’s keys.

  • separator (str) – The string used to separate flattened keys.

Returns:

A flattened dictionary.

Return type:

dict[str, Any]

Notes

Taken from: https://stackoverflow.com/a/62186053

Examples

>>> flatten_dict({'hello': ['is', 'me'],
...               "goodbye": {"nest": "birds", "child": "moon"}})
{'hello_0': 'is', 'hello_1': 'me', 'goodbye_nest': 'birds', 'goodbye_child': 'moon'}
castep_outputs.utilities.utility.get_only(seq)

Get the only element of a Sequence ensuring uniqueness.

Parameters:

seq (Sequence[TypeVar(T)]) – Sequence of one element.

Returns:

The sole element of the sequence.

Return type:

Any

Raises:

ValueError – Value is not alone.
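
The behaviour described above can be illustrated with a small stand-in. get_only_sketch is a hypothetical name and this is not the library's code; in particular, rejecting an empty sequence with ValueError is an assumption, since the source only says the value must be alone.

```python
def get_only_sketch(seq):
    """Hypothetical stand-in: return the single element of `seq`."""
    # Assumption: both empty and multi-element sequences are rejected.
    if len(seq) != 1:
        raise ValueError(f"Expected exactly one element, got {len(seq)}")
    return seq[0]


species = get_only_sketch(["Ar"])
```

A helper of this kind turns a silent "take the first match" into an explicit check, which is useful when a parser expects a quantity to appear exactly once.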

castep_outputs.utilities.utility.json_safe(obj)

Recursively transform datatypes into JSON safe variants.

Including:

  • Ensuring dict keys are strings without spaces.

  • Ensuring complex numbers are split into real/imag components.

Parameters:

obj (Union[dict, complex, TypeVar(T)]) – Incoming datatype.

Returns:

Safe datatype.

Return type:

Any

Examples

>>> json_safe(3 + 4j)
{'real': 3.0, 'imag': 4.0}
>>> json_safe({('Ar', 'Sr'): 3})
{'Ar_Sr': 3}
>>> json_safe({(('Ar', 1), ('Sr', 1)): 3})
{'Ar_1_Sr_1': 3}
castep_outputs.utilities.utility.log_factory(file)

Return logging function to add file info to logs.

Parameters:

file (TextIO or FileInput or FileWrapper) – File to apply logging for.

Returns:

Function for logging data.

Return type:

Callable

castep_outputs.utilities.utility.normalise(obj, mapping)

Standardise data after processing.

Recursively converts:

  • lists to tuples

  • defaultdicts to dicts

  • types in mapping to their mapped type or apply mapped function.

Parameters:
  • obj (TypeVar(T)) – Object to normalise.

  • mapping (dict[type, type | Callable]) – Mapping of type to a callable transformation including class constructors.

Returns:

Normalised data.

Return type:

Any
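
The recursive conversion listed above can be sketched as follows. normalise_sketch is a hypothetical stand-in rather than the library's implementation; applying the mapping before recursing, and matching on the exact type only, are assumptions.

```python
from collections import defaultdict


def normalise_sketch(obj, mapping):
    """Hypothetical stand-in for the recursive clean-up described above."""
    # Assumption: a mapped transformation is applied first, by exact type.
    if type(obj) in mapping:
        obj = mapping[type(obj)](obj)
    if isinstance(obj, (list, tuple)):
        # Lists become tuples; elements are normalised recursively.
        return tuple(normalise_sketch(item, mapping) for item in obj)
    if isinstance(obj, dict):
        # defaultdict is a dict subclass, so it becomes a plain dict here.
        return {key: normalise_sketch(val, mapping) for key, val in obj.items()}
    return obj


raw = defaultdict(list)
raw["forces"].append([1.0, 2.0])
clean = normalise_sketch(raw, {})
shouted = normalise_sketch({"x": ["a", "b"]}, {str: str.upper})
```

In this sketch, clean is a plain dict holding nested tuples, and shouted shows a mapped callable (str.upper) being applied to every string value. Dictionary keys are left untouched.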

castep_outputs.utilities.utility.normalise_key(string)

Normalise a dictionary key.

This includes:

  • Removing all punctuation.

  • Lower-casing all.

  • Making all spacing single-underscore.

Parameters:

string (str) – String to process.

Returns:

Normalised string.

Return type:

str

Examples

>>> normalise_key(" Several   words  ")
'several_words'
>>> normalise_key("A sentence.")
'a_sentence'
>>> normalise_key("I<3;;semi-colons;;!!!")
'i_3_semi_colons'
castep_outputs.utilities.utility.normalise_string(string)

Normalise a string.

This includes:

  • Removing leading/trailing whitespace.

  • Making all spacing single-space.

Parameters:

string (str) – String to process.

Returns:

Normalised string.

Return type:

str

Examples

>>> normalise_string(" Several   words  ")
'Several words'
castep_outputs.utilities.utility.parse_int_or_float(numbers)

Parse numbers to int if all elements are ints, or to float otherwise.

Parameters:

numbers (Iterable[str]) – Sequence of numbers to parse.

Returns:

Parsed numerical value.

Return type:

int or float

Examples

>>> parse_int_or_float("3.141")
3.141
>>> parse_int_or_float("7")
7
castep_outputs.utilities.utility.stack_dict(out_dict, in_dict)

Append items in in_dict to the keys in out_dict.

Parameters:
  • out_dict – Dictionary whose values are appended to.

  • in_dict – Dictionary of items to append.

Return type:

None
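
The accumulation described above can be sketched as follows. stack_dict_sketch is a hypothetical stand-in, not the library's code; it assumes the values of out_dict are lists, and creating a new list for a missing key is an additional assumption of this sketch.

```python
def stack_dict_sketch(out_dict, in_dict):
    """Hypothetical stand-in: append each value of in_dict under its key."""
    for key, val in in_dict.items():
        # Assumption: unseen keys start a new list instead of raising.
        out_dict.setdefault(key, []).append(val)


results = {}
stack_dict_sketch(results, {"energy": -1.5})
stack_dict_sketch(results, {"energy": -1.4, "force": 0.1})
```

After the two calls, results holds one list per key, in the order the values were stacked. Like the documented function, the sketch returns None and modifies out_dict in place.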

castep_outputs.utilities.utility.strip_comments(data, *, comment_char='#!', remove_inline=False)

Strip comments from data.

Parameters:
  • data (TextIO | FileWrapper | Block) – Data to strip comments from.

  • remove_inline (bool) – Whether to remove inline comments as well, or only line-initial comments.

  • comment_char (str | set[str]) –

    Character sets to read as comments and remove.

    Note

    If the chars are passed as a string, it is assumed that each character is a comment character.

    To match a multicharacter comment you must pass this as a set or sequence of strings.

Returns:

Block of data without comments.

Return type:

Block

Notes

Also strips trailing, but not leading, whitespace to clean up comment blocks.

Also strips empty lines.

Examples

>>> from io import StringIO
>>> inp = StringIO('''
... Hello
... # Initial line comment
... End of line # comment
... // C-style
... ''')
>>> x = strip_comments(inp, remove_inline=False)
>>> type(x).__name__
'Block'
>>> '|'.join(x)
'Hello|End of line # comment|// C-style'
>>> _ = inp.seek(0)
>>> x = strip_comments(inp, remove_inline=True)
>>> '|'.join(x)
'Hello|End of line|// C-style'
>>> _ = inp.seek(0)
>>> x = strip_comments(inp, comment_char={"//", "#"})
>>> '|'.join(x)
'Hello|End of line # comment'
castep_outputs.utilities.utility.to_type(data_in, _typ)

Convert data_in to typ, whether data_in is iterable or a single value.

Parameters:
  • data_in (str or Sequence) – Data to convert.

  • typ (type) – Type to convert to.

Returns:

Converted data.

Return type:

typ or tuple[typ, …]
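
The scalar-or-sequence behaviour described above can be sketched as follows. to_type_sketch is a hypothetical stand-in, not the library's implementation. It calls the plain constructor and special-cases no type, so it is only meaningful for types such as int and float: Python's bool() treats any non-empty string as True.

```python
def to_type_sketch(data_in, typ):
    """Hypothetical stand-in: convert one string or a sequence of strings."""
    # A str is itself iterable, so it must be checked before iterating.
    if isinstance(data_in, str):
        return typ(data_in)
    return tuple(typ(item) for item in data_in)


single = to_type_sketch("7", int)
vector = to_type_sketch(["3", "4", "5"], float)
```

Here single is an int and vector is a tuple of floats, matching the documented return type of typ or tuple[typ, …].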