parse_formula

バージョン 0.1.0 (21.7 KB) 作成者: phenan08
A chemical formula string parser for MATLAB
ダウンロード: 19
更新 2022/9/15

parse_formula

A chemical formula string parser for MATLAB

Description

parse_formula is a string parser designed for MATLAB and that permits to convert linear or condensed chemical formulas into the corresponding raw chemical formulas. The script offers additional features, such as element counting, average molecular weight and monoisotopic mass calculation.

The script can be used to generate proper inputs for other scripts like isoDalton_exact_mass (see DOI: 10.1016/j.jasms.2007.05.016).

For now, the script does not support isotope labeling. Consequently, the molecular weights are not calculated when the input formula contains unstable elements.

Syntax

The general syntax is [a,b,c,d] = parse_formula(input), where:

  • input is a scalar string containg the formula to parse (e.g. "CH3CH2CH2CH3"),
  • a is the raw chemical formula returned after parsing as a scalar string,
  • b a structure containing two fields, namely element and count, which associated every entry to a chemical element symbol and the corresponding count in the provided formula,
  • c is the average molecular weight calculated for the compound, returned as a scalar double,
  • d is the monoisotopic mass of the compound, return as a scalar double as well.

Examples

parse_formula("CH3CH2CH2CH3") will produce the following output:

ans = 

    "C4H10"

Strings containing brackets can also be provided as inputs:

>> parse_formula("CH3(CH2)2CH3")

    ans = 

    "C4H10"

... as well as strings containing nested brackets:

>> parse_formula("CH3(C(CH2)2)2CH3")

ans = 

    "C8H14"

To obtain the list of the different elements in the provided formula with their respective counts, the following command can be used:

>> [~,counts] = parse_formula("CH3(C(CH2)2)2CH3")

counts = 

  2×1 struct array with fields:

    element
    count

The different entries generated each contain the symbol of the corresponding element (scalar string in field element) and the element counts (scalar double in field count):

>> counts(1).element, counts(1).count

ans = 

    "C"


ans =

     8

To calculate the average molecular weight and the monoisotopic mass, supplementary output argument must be required:

>>  [formula,counts,MW,monoisotopic_mass] = parse_formula("CH3(C(CH2)2)2CH3")

formula = 

    "C8H14"


counts = 

  2×1 struct array with fields:

    element
    count


MW =

  110.1971


monoisotopic_mass =

  110.1096

Errors to avoid

If non valid elements are provided in the input string, the script will return an error:

>> parse_formula("CH3(Cn)2CH3")
Error using parse_formula
Cn is not a valid chemical element.

If elements having no stable isotope are provided in the input string, the script will return a warning message (without stopping code execution). In this case, the c and d output arguments will be returned as empty doubles.

>> [a,b,c,d] = parse_formula("Pu(C2O4)2(H2O)6")
Warning: At least one element of the string provided is unstable. The molecular weight and monoisotopic mass will not be
calculated. 
> In parse_formula (line 97) 

a = 

    "C4H12O14Pu1"


b = 

  4×1 struct array with fields:

    element
    count


c =

     []


d =

     []

Future implementations

The following functionalities will be implemented in future versions:

  • the possibility to use abbreviations (e.g. Me, Et, Pr, Bu, Ph, Cy...) for radicals in the input string,
  • the possibility to specify isotopes in the input string (e.g. [13]C4H10, [13]C1C3H10...).

引用

phenan08 (2024). parse_formula (https://github.com/burelant/parse_formula), GitHub. 取得済み .

MATLAB リリースの互換性
作成: R2022b
すべてのリリースと互換性あり
プラットフォームの互換性
Windows macOS Linux

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

GitHub の既定のブランチを使用するバージョンはダウンロードできません

バージョン 公開済み リリース ノート
0.1.0

この GitHub アドオンでの問題を表示または報告するには、GitHub リポジトリにアクセスしてください。
この GitHub アドオンでの問題を表示または報告するには、GitHub リポジトリにアクセスしてください。