castep_outputs.utilities.utility

Utility functions for parsing castep outputs.

Functions

add_aliases(in_dict, alias_dict, *[, ...])

Add aliases of known names into dictionary.

atreg_to_index(dict_in, *[, clear])

Transform a matched atreg value to species index tuple.

file_or_path(*, mode, **open_kwargs)

Decorate to allow a parser to accept either a path or open file.

filter_underscore(x, /)

Remove underscored keys from dict.

flatten_dict(dictionary[, parent_key, separator])

Turn a nested dictionary into a flattened dictionary.

get_only(seq)

Get the only element of a Sequence ensuring uniqueness.

json_safe(obj)

Recursively transform datatypes into JSON safe variants.

log_factory(file)

Return logging function to add file info to logs.

normalise(obj, mapping)

Standardise data after processing.

normalise_key(string)

Normalise a dictionary key.

normalise_string(string)

Normalise a string.

stack_dict(out_dict, in_dict)

Append items in in_dict to the keys in out_dict.

strip_comments(data, *[, comment_char, ...])

Strip comments from data.

strip_nones(data, *[, include, exclude])

Strip None from datasets.

Classes

ComplexDict

Dict of complex values.

Logger(*args, **kwargs)

Protocol for logging classes.

class castep_outputs.utilities.utility.ComplexDict

Bases: TypedDict

Dict of complex values.

class castep_outputs.utilities.utility.Logger(*args, **kwargs)

Bases: Protocol

Protocol for logging classes.

__call__(message, *args, level='info')

Call method for logging methods.

Parameters:
  • message (str)

  • args (Any)

  • level (Literal['debug', 'info', 'warning', 'error', 'critical'])

Return type:

None

castep_outputs.utilities.utility.add_aliases(in_dict, alias_dict, *, replace=False, inplace=True)

Add aliases of known names into dictionary.

If replace is True, the original key is removed.

Parameters:
  • in_dict (dict[str, Any]) – Dictionary of data to alias.

  • alias_dict (dict[str, str]) – Mapping of from->to for keys in in_dict.

  • replace (bool) – Whether to remove the from key from in_dict.

  • inplace (bool) – Whether to overwrite in_dict (True) or return a modified copy (False).

Returns:

in_dict with keys substituted.

Return type:

dict[str, Any]

Examples

>>> add_aliases({'hi': 1, 'bye': 2}, {'hi': 'frog'})
{'hi': 1, 'bye': 2, 'frog': 1}
>>> add_aliases({'hi': 1, 'bye': 2}, {'hi': 'frog'}, replace=True)
{'bye': 2, 'frog': 1}

castep_outputs.utilities.utility.atreg_to_index(dict_in, *, clear=True)

Transform a matched atreg value to species index tuple.

Optionally clear value from dictionary for easier processing.

Parameters:
  • dict_in (dict[str, str] | Match) – Atreg to process.

  • clear (bool) – Whether to remove from incoming dictionary.

Returns:

  • str – Atomic species.

  • int – Internal index.

Return type:

tuple[str, int]

Examples

>>> parsed_line = {'x': 3.1, 'y': 2.1, 'z': 1.0, 'spec': 'Ar', 'index': '1'}
>>> atreg_to_index(parsed_line, clear=False)
('Ar', 1)
>>> parsed_line
{'x': 3.1, 'y': 2.1, 'z': 1.0, 'spec': 'Ar', 'index': '1'}
>>> atreg_to_index(parsed_line)
('Ar', 1)
>>> parsed_line
{'x': 3.1, 'y': 2.1, 'z': 1.0}

castep_outputs.utilities.utility.file_or_path(*, mode, **open_kwargs)

Decorate to allow a parser to accept either a path or open file.

Parameters:
  • mode (Literal['r', 'rb']) – Open mode if passed a Path or str.

  • open_kwargs (Any)

Returns:

Wrapped function able to handle open files or paths invisibly.

Return type:

Callable
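
The decorator pattern described above can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: file_or_path_sketch and count_lines are hypothetical names, and the sketch assumes that anything which is not a str or Path is an already open file.

```python
import functools
from pathlib import Path
from typing import Any, Callable


def file_or_path_sketch(*, mode: str, **open_kwargs: Any) -> Callable:
    """Hypothetical stand-in for a path-or-file decorator (not the library's code)."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(file: Any, *args: Any, **kwargs: Any) -> Any:
            if isinstance(file, (str, Path)):
                # Paths are opened here and closed once the parser returns.
                with open(file, mode, **open_kwargs) as handle:
                    return func(handle, *args, **kwargs)
            # Assumption: anything else is an already open file object.
            return func(file, *args, **kwargs)

        return wrapper

    return decorator


@file_or_path_sketch(mode="r", encoding="utf-8")
def count_lines(handle) -> int:
    """Toy parser: count the lines in an open text file."""
    return sum(1 for _ in handle)
```

With this wrapper the toy parser can be called with a file name, a Path, or an open handle, and the parser body only ever sees an open file.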

castep_outputs.utilities.utility.filter_underscore(x, /)

Remove underscored keys from dict.

Parameters:

x (Mapping[str, T]) – Mapping to filter.

Returns:

New dict with “_”-prefixed keys removed.

Return type:

dict[str, T]

Examples

>>> x = {"_val": 1, "other": 2}
>>> filter_underscore(x)
{'other': 2}

castep_outputs.utilities.utility.flatten_dict(dictionary, parent_key='', separator='_')

Turn a nested dictionary into a flattened dictionary.

Parameters:
  • dictionary (MutableMapping[Any, Any]) – The dictionary to flatten.

  • parent_key (str) – The string to prepend to dictionary’s keys.

  • separator (str) – The string used to separate flattened keys.

Returns:

A flattened dictionary.

Return type:

dict[str, Any]

Notes

Taken from: https://stackoverflow.com/a/62186053

Examples

>>> flatten_dict({'hello': ['is', 'me'],
...               "goodbye": {"nest": "birds", "child": "moon"}})
{'hello_0': 'is', 'hello_1': 'me', 'goodbye_nest': 'birds', 'goodbye_child': 'moon'}

castep_outputs.utilities.utility.get_only(seq)

Get the only element of a Sequence ensuring uniqueness.

Parameters:

seq (Sequence[T]) – Sequence of one element.

Returns:

The sole element of the sequence.

Raises:

ValueError – If seq does not contain exactly one element.

Return type:

T
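
The contract above (return the sole element, raise ValueError otherwise) can be illustrated with a small standalone sketch. get_only_sketch is a hypothetical name, and the length check is an assumption about how uniqueness might be enforced, not the library's code.

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")


def get_only_sketch(seq: Sequence[T]) -> T:
    """Return the sole element of seq, or raise ValueError (hypothetical stand-in)."""
    if len(seq) != 1:
        # Empty sequences and sequences with several elements are both rejected.
        raise ValueError(f"Expected exactly one element, got {len(seq)}")
    return seq[0]
```

Under these assumptions, get_only_sketch(["Ar"]) returns 'Ar', while an empty list or a two-element list raises ValueError.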

castep_outputs.utilities.utility.json_safe(obj: complex) -> ComplexDict
castep_outputs.utilities.utility.json_safe(obj: T) -> T
castep_outputs.utilities.utility.json_safe(obj: dict[Any, T]) -> dict[str, T]

Recursively transform datatypes into JSON safe variants.

Including:

  • Ensuring dict keys are strings without spaces.

  • Ensuring complex numbers are split into real/imag components.

Parameters:

obj – Incoming datatype.

Returns:

Safe datatype.

Examples

>>> json_safe(3 + 4j)
{'real': 3.0, 'imag': 4.0}
>>> json_safe({('Ar', 'Sr'): 3})
{'Ar_Sr': 3}
>>> json_safe({(('Ar', 1), ('Sr', 1)): 3})
{'Ar_1_Sr_1': 3}

castep_outputs.utilities.utility.log_factory(file)

Return logging function to add file info to logs.

Parameters:

file (TextIO | FileInput | FileWrapper) – File to apply logging for.

Returns:

Function for logging data.

Return type:

Logger
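
One way a factory of this kind could work is sketched below. This is a hedged illustration only: log_factory_sketch is a hypothetical name, the use of the standard logging module and of a name attribute on the file are assumptions, and the real function may attach different file information. The returned callable follows the Logger call signature documented above (message, *args, level='info').

```python
import logging
from typing import Any, Callable


def log_factory_sketch(file: Any) -> Callable[..., None]:
    """Build a logging function that prefixes messages with the file name."""
    # Assumption: the file object may expose a ``name`` attribute.
    name = getattr(file, "name", "<unknown>")
    logger = logging.getLogger("log_factory_sketch")

    def log(message: str, *args: Any, level: str = "info") -> None:
        # Dispatch to logger.debug / logger.info / ... according to ``level``;
        # the file name is passed as the first %-style formatting argument.
        getattr(logger, level)("[%s] " + message, name, *args)

    return log
```

A parser would then create the function once per file and call it as log("Found %d atoms", 3, level="warning").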

castep_outputs.utilities.utility.normalise(obj: T, mapping: dict[type[In], type[Out] | Callable[[In], Out]]) -> T
castep_outputs.utilities.utility.normalise(obj: In, mapping: dict[type[In], type[Out] | Callable[[In], Out]]) -> Out
castep_outputs.utilities.utility.normalise(obj: Iterable[In | T], mapping: dict[type[In], type[Out] | Callable[[In], Out]]) -> tuple[Out | T, ...]
castep_outputs.utilities.utility.normalise(obj: Mapping[K, In | T], mapping: dict[type[In], type[Out] | Callable[[In], Out]]) -> dict[K, Out | T]

Standardise data after processing.

Recursively converts:

  • lists to tuples

  • defaultdicts to dicts

  • instances of the types in mapping to their mapped type, or to the result of the mapped function.

Parameters:
  • obj – Object to normalise.

  • mapping – Mapping of type to a callable transformation including class constructors.

Returns:

Normalised data.
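
The recursive conversion described above can be sketched as below. This is a simplified standalone illustration, not the library's implementation: normalise_sketch is a hypothetical name, and the order in which the rules are applied (mapping first, then containers) is an assumption.

```python
from collections import defaultdict
from typing import Any, Callable


def normalise_sketch(obj: Any, mapping: dict[type, Callable[[Any], Any]]) -> Any:
    """Recursively standardise obj (hypothetical stand-in, not the library's code)."""
    # Instances of types listed in mapping are converted by constructor or function.
    for in_type, transform in mapping.items():
        if isinstance(obj, in_type):
            return transform(obj)
    # Dicts (including defaultdict) become plain dicts with normalised values.
    if isinstance(obj, dict):
        return {key: normalise_sketch(val, mapping) for key, val in obj.items()}
    # Lists and tuples become tuples with normalised elements.
    if isinstance(obj, (list, tuple)):
        return tuple(normalise_sketch(val, mapping) for val in obj)
    # Everything else is returned unchanged.
    return obj


data = defaultdict(list, {"a": [1, 2.0], "b": {"c": []}})
result = normalise_sketch(data, {float: str})
# result == {'a': (1, '2.0'), 'b': {'c': ()}}
```

Here the float is converted by the str constructor, the lists become tuples, and the defaultdict becomes a plain dict.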

castep_outputs.utilities.utility.normalise_key(string)

Normalise a dictionary key.

This includes:

  • Removing all punctuation.

  • Lower-casing all.

  • Making all spacing single-underscore.

Parameters:

string (str) – String to process.

Returns:

Normalised string.

Return type:

str

Examples

>>> normalise_key(" Several   words  ")
'several_words'
>>> normalise_key("A sentence.")
'a_sentence'
>>> normalise_key("I<3;;semi-colons;;!!!")
'i_3_semi_colons'

castep_outputs.utilities.utility.normalise_string(string)

Normalise a string.

This includes:

  • Removing leading/trailing whitespace.

  • Making all spacing single-space.

Parameters:

string (str) – String to process.

Returns:

Normalised string.

Return type:

str

Examples

>>> normalise_string(" Several   words  ")
'Several words'

castep_outputs.utilities.utility.stack_dict(out_dict, in_dict)

Append items in in_dict to the keys in out_dict.

Parameters:
  • out_dict (Mapping[Any, list[T]]) – Dict to append to.

  • in_dict (Mapping[Any, T]) – Dict to append from.

Return type:

None
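
The appending behaviour can be illustrated with a short standalone sketch. stack_dict_sketch is a hypothetical name, and the sketch assumes out_dict already provides a list for every key (for example a defaultdict(list)); it is not the library's code.

```python
from collections import defaultdict
from typing import Any


def stack_dict_sketch(out_dict: dict[Any, list], in_dict: dict[Any, Any]) -> None:
    """Append each value of in_dict to the list under the same key in out_dict."""
    for key, val in in_dict.items():
        # Assumption: out_dict[key] is (or defaults to) a list.
        out_dict[key].append(val)


# Accumulate per-step values into one list per key.
frames: defaultdict = defaultdict(list)
stack_dict_sketch(frames, {"energy": -1.0, "step": 1})
stack_dict_sketch(frames, {"energy": -1.5, "step": 2})
# dict(frames) == {'energy': [-1.0, -1.5], 'step': [1, 2]}
```

The function returns None and modifies out_dict in place, matching the return type given above.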

castep_outputs.utilities.utility.strip_comments(data, *, comment_char='#!', remove_inline=False)

Strip comments from data.

Parameters:
  • data (TextIO | FileWrapper | Block) – Data to strip comments from.

  • remove_inline (bool) – Whether to remove inline comments or just line initial.

  • comment_char (str | set[str]) –

    Character sets to read as comments and remove.

    Note

    If the chars are passed as a string, it is assumed that each character is a comment character.

    To match a multicharacter comment you must pass this as a set or sequence of strings.

Returns:

Block of data without comments.

Return type:

Block

Notes

Also strips trailing, but not leading, whitespace to clean up comment blocks.

Also strips empty lines.

Examples

>>> from io import StringIO
>>> inp = StringIO('''
... Hello
... # Initial line comment
... End of line # comment
... // C-style
... ''')
>>> x = strip_comments(inp, remove_inline=False)
>>> type(x).__name__
'Block'
>>> '|'.join(x)
'Hello|End of line # comment|// C-style'
>>> _ = inp.seek(0)
>>> x = strip_comments(inp, remove_inline=True)
>>> '|'.join(x)
'Hello|End of line|// C-style'
>>> _ = inp.seek(0)
>>> x = strip_comments(inp, comment_char={"//", "#"})
>>> '|'.join(x)
'Hello|End of line # comment'

castep_outputs.utilities.utility.strip_nones(data, *, include=None, exclude=())

Strip None from datasets.

Parameters:
  • data (dict[K, Any]) – Dataset to filter.

  • include (Iterable[K] | None) – Keys to include (or all if None).

  • exclude (Iterable[K]) – Keys/indices to ignore.

Return type:

None

Examples

>>> data = {"a": 1, "b": None, "c": None}
>>> strip_nones(data, exclude=("c",))
>>> data
{'a': 1, 'c': None}