How to use NLP to compare text from two different corpora?

📰

How to use NLP to compare text from two different corpora?

Data Science·/u/iwannabeunknown3·about 1 month ago

#reddit #x200b #observations #incidents #want #compare

Reading 0:00

15s threshold

 I am not well versed in NLP, so hopefully someone can help me out here. I am looking at safety incidents for my organization. I want to compare the text of incident reports and observations to investigate if our observations are deterring incidents. I have a dataset of the incidents and a dataset of the observations. Both datasets have a free-text field that contains the description of the incident or observation. There is not really a good link between observations and incidents (as in, these observations were monitoring X activity on Y contract, and an incident also occurred during X activity on Y contract). My feeling is that the observations are just busy work; they don’t actually observe the activities that need safety improvement. The correlation between number of observations and number of incidents is minor, but I want to make a stronger case. I want to investigate this by using NLP to describe the incidents, then describe the observations, and see if there is a difference in content.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Create free account Log in

Menu

How to use NLP to compare text from two different corpora?