
Found issue with space encoding in Description file in RF2 for SNOMED CT release 20210331

  • Posts: 50
3 years 2 months ago #7121 by Anibal Jodorcovsky
OK, so what does that mean exactly? I'm new to this, so please bear with me. I'm assuming your answer is telling me that somebody within CHI is then responsible for fixing this?


  • Posts: 13
3 years 2 months ago #7120 by Jon Zammit
Hi Anibal,

The descriptions you have highlighted in your screenshot are part of the Canadian extension. You can determine that from the moduleId attribute, which in this case is 20611000087101 |Canada Health Infoway French module (core metadata concept)|.
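As a rough illustration of the moduleId check described above, the following Python sketch filters rows of an RF2 description file by moduleId. It assumes the standard tab-delimited RF2 description layout with a header row. The function name `rows_in_module` and the two sample rows are made up for this example and are not taken from the release.

```python
import csv
import io

# moduleId quoted in this thread for the Canada Health Infoway French module.
CANADA_FRENCH_MODULE = "20611000087101"


def rows_in_module(lines, module_id):
    """Return the description rows whose moduleId equals module_id.

    `lines` is any iterable of text lines (an open file or a StringIO).
    QUOTE_NONE keeps quote characters inside terms from being treated as CSV quoting.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row["moduleId"] == module_id]


# Invented sample data in the RF2 description column order (not real release rows).
sample = io.StringIO(
    "id\teffectiveTime\tactive\tmoduleId\tconceptId\tlanguageCode\ttypeId\tterm\tcaseSignificanceId\n"
    "1\t20210331\t1\t20611000087101\t100\tfr\t900000000000013009\texemple de terme\t900000000000448009\n"
    "2\t20210331\t1\t900000000000207008\t100\ten\t900000000000013009\texample term\t900000000000448009\n"
)

canadian_rows = rows_in_module(sample, CANADA_FRENCH_MODULE)
for row in canadian_rows:
    print(row["id"], row["term"])
```

To run this against a real file, pass `open(path, encoding="utf-8", newline="")` in place of the `StringIO` sample.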

I hope that helps.

Regards,

Jon Zammit


  • Posts: 50
3 years 2 months ago #7119 by Anibal Jodorcovsky
Hi all,

Not sure if this is the right place to post this, but given the group description I thought it'd be worth a shot.

I'm trying to automate a whole bunch of tasks that were done by hand previously within our group.

To this end, I’m writing several scripts and SQL queries against an MS Access DB that houses the RF2 SNOMED CT CAD release.

One of our tools is not working as expected when doing comparisons, and after a lot of digging I discovered that the source files within the RF2 release encode the space between words differently in some cases.

The file in question is this:

C:\Users\aniba\Desktop\SnomedCT_Canadian_EditionRelease_PRODUCTION_20210331T120000Z\Full\Terminology\sct2_Description_Full_CanadianEdition_20210331.txt

See attachment - taken from a Sublime Text capture - where we can see the issue [hmmm, I can't find a way to upload an attachment to a topic].

I uploaded the screenshot to a public Google folder. Here it is:

drive.google.com/file/d/1XtIPvxFXQNHFPyHdzIlwXRxNsKFQ7lFI/view?usp=sharing

That’s the screen where I’m seeing “hidden” characters in some of the terms.

Notice how the space between several terms is encoded as <0xa0> rather than <0x20>, as it should be and as it is in all other terms.

<0xa0> is the non-breaking space (U+00A0), which falls outside 7-bit ASCII, and it should not be mixed with regular spaces in txt files like this, in particular when we need to be consistent. So, we either use <0xa0> for all spaces, or <0x20>.

This is causing our tools to break, and they are unable to compare terms automatically.
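One possible way to detect the mixed spaces and work around them when comparing, sketched in Python. The helper names and the sample terms are hypothetical. It assumes the file is read as UTF-8, where the non-breaking space U+00A0 is stored as the two bytes 0xC2 0xA0 (an editor may still display it as <0xa0>).

```python
NBSP = "\u00a0"  # non-breaking space, shown as <0xa0> in the screenshot


def find_nbsp_terms(terms):
    """Return the terms that contain at least one non-breaking space."""
    return [term for term in terms if NBSP in term]


def normalize_spaces(term):
    """Replace every non-breaking space with a regular space (0x20)."""
    return term.replace(NBSP, " ")


# Hypothetical terms: the second one uses U+00A0 between the two words.
terms = ["heart attack", "heart\u00a0attack"]

flagged = find_nbsp_terms(terms)
print(len(flagged), "term(s) contain a non-breaking space")

# The raw strings differ, but they compare equal once normalized.
same_raw = terms[0] == terms[1]
same_normalized = normalize_spaces(terms[0]) == normalize_spaces(terms[1])
print(same_raw, same_normalized)

# How the character is stored on disk depends on the encoding.
print(NBSP.encode("utf-8"), NBSP.encode("latin-1"))
```

Normalizing only at comparison time leaves the release files untouched, so the workaround does not depend on the source data being changed.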

Is this something that comes from SNOMED International or is this something that comes from CHI?


Moderators: Linda Monico, Naomi Brooks, Helen Wu
