Hi team,
I noticed that in the function normalize_smiles (at load.py#L41), the parameter isomeric is set to False by default:
def normalize_smiles(smi, canonical=True, isomeric=False):
    try:
        normalized = Chem.MolToSmiles(
            Chem.MolFromSmiles(smi),
            canonical=canonical,
            isomericSmiles=isomeric
        )
    except:
        normalized = None
    return normalized
As far as I understand, this means stereochemical information in SMILES (such as @, /, or \) will be lost during normalization.
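To gauge how many inputs this would affect, here is a minimal sketch that flags SMILES strings carrying stereo markers before they are normalized. The helper has_stereo_markers and the example list are hypothetical and not part of the SMI-TED code base. The check is a plain character scan and assumes standard SMILES, where @ appears only in chirality tags and / and \ appear only as directional bonds.

```python
# Hypothetical helper for illustration; not part of the SMI-TED repository.
# Assumes plain SMILES, where "@" marks chirality and "/" or "\" mark
# double-bond geometry.
STEREO_MARKERS = ("@", "/", "\\")


def has_stereo_markers(smi: str) -> bool:
    """Return True if the SMILES string contains a chirality tag or a directional bond."""
    return any(marker in smi for marker in STEREO_MARKERS)


examples = [
    "C[C@H](N)C(=O)O",  # alanine with a defined tetrahedral centre
    "F/C=C/F",          # trans-1,2-difluoroethene
    "CCO",              # ethanol, no stereochemistry
]

# Keep only the entries whose stereo information would be dropped
# when isomeric=False is used.
flagged = [smi for smi in examples if has_stereo_markers(smi)]
print(flagged)
```

Running a scan like this over a dataset would show how many molecules collapse onto the same string once isomeric=False is applied.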
I'd like to confirm:
- Is the current SMI-TED model designed not to support SMILES with stereochemical information?
- If that's the case, is there any plan or recommended way to handle isomeric SMILES (e.g., enabling isomeric=True during preprocessing)?
Thank you for your time and clarification!