medspacy.section_detection.section_rule
SectionRule
Bases: BaseRule
SectionRule defines rules for matching section headers in text using the Sectionizer.
Source code in medspacy/section_detection/section_rule.py
__init__(literal, category, pattern=None, on_match=None, max_scope=None, parents=None, parent_required=False, metadata=None)
Class for defining rules for matching section headers in text using the Sectionizer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| literal | str | The string representation of a concept. If … | required |
| category | str | The semantic class of the matched span. This corresponds to the … | required |
| pattern | Optional[Union[List[Dict[str, str]], str]] | A list or string to use as a spaCy pattern rather than `literal`. | None |
| on_match | Optional[Callable[[Matcher, Doc, int, List[Tuple[int, int, int]]], Any]] | An optional callback function or other callable which takes 4 arguments: `(matcher, doc, i, matches)`. | None |
| max_scope | Optional[int] | The number of tokens to which a section body is explicitly limited. If None, the scope extends from the section header to the next section header or the end of the doc, whichever comes first. This can also be set globally as `Sectionizer(nlp, max_scope=...)`, but a value set on the rule takes precedence. If not None, this is the number of tokens following the matched section header. Example: in the text "Past Medical History: Pt has hx of pneumonia", `SectionRule("Past Medical History:", "pmh", max_scope=None)` will include the entire doc, but `SectionRule("Past Medical History:", "pmh", max_scope=2)` will limit the section to "Past Medical History: Pt has". This is useful for limiting sections that are known to be short, or for allowing others to be longer than the global max_scope. | None |
| parents | Optional[List[str]] | A list of candidate parents for determining subsections. | None |
| parent_required | bool | Whether a parent is required for the section to exist in the final output. If True and no parent is identified, the section will be removed. | False |
| metadata | Optional[Dict[Any, Any]] | Optional dictionary of any extra metadata. | None |
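The `max_scope` behaviour described above can be illustrated with a small standalone sketch. It does not call medspaCy; `effective_max_scope` and `section_span` are hypothetical helpers written only for this illustration, and the token list is split by hand on the assumption that the header "Past Medical History:" is four tokens. The sketch also assumes the header starts at the first token and that no later section header appears.

```python
from typing import List, Optional


def effective_max_scope(rule_scope: Optional[int],
                        global_scope: Optional[int]) -> Optional[int]:
    """Hypothetical helper: a value set on the rule overrides the global one."""
    return rule_scope if rule_scope is not None else global_scope


def section_span(tokens: List[str], header_len: int,
                 max_scope: Optional[int] = None) -> List[str]:
    """Hypothetical helper: header tokens plus the body allowed by max_scope.

    Assumes the header starts at token 0 and no other header follows.
    """
    if max_scope is None:
        # No limit: the section runs to the end of the token list.
        return list(tokens)
    # Keep the header and only `max_scope` tokens after it.
    return tokens[: header_len + max_scope]


# Hand-tokenized example text (assumed tokenization, header is 4 tokens).
tokens = ["Past", "Medical", "History", ":", "Pt", "has", "hx", "of", "pneumonia"]

full = section_span(tokens, header_len=4, max_scope=None)
limited = section_span(tokens, header_len=4, max_scope=2)

# A rule-level value of 2 wins over a global value of 50.
scope = effective_max_scope(rule_scope=2, global_scope=50)
```

With `max_scope=None` the whole token list is returned, while `max_scope=2` keeps only the two tokens that follow the header, mirroring the "Pt has" example in the parameter description.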
from_dict(rule_dict)
classmethod
Reads a dictionary into a SectionRule. Used when reading from a JSON file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rule_dict | | The dictionary to convert. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| item | | The SectionRule created from the dictionary. |
from_json(filepath)
classmethod
Reads a list of section rules from a JSON file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filepath | | The .json file containing section rules. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| section_rules | List[SectionRule] | A list of SectionRule objects. |
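As a rough illustration of what a rules file might look like, the sketch below writes and reads a small JSON document using only the standard library. The top-level key `"section_rules"` is an assumption made for this example and is not confirmed by this page; `load_rule_dicts` is a hypothetical helper, not part of medspaCy. The only facts taken from the documentation above are the parameter names and that `literal` and `category` are required.

```python
import json
import os
import tempfile
from typing import Any, Dict, List

# `literal` and `category` are documented as required constructor arguments.
REQUIRED_KEYS = {"literal", "category"}


def load_rule_dicts(filepath: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: read rule dictionaries from a JSON file."""
    with open(filepath, encoding="utf-8") as f:
        data = json.load(f)
    rules = data["section_rules"]  # assumed top-level key
    for rule in rules:
        missing = REQUIRED_KEYS - rule.keys()
        if missing:
            raise ValueError(f"Rule {rule!r} is missing keys: {sorted(missing)}")
    return rules


# Example file contents; keys mirror the documented constructor parameters.
example = {
    "section_rules": [
        {"literal": "Past Medical History:", "category": "pmh", "max_scope": 2},
        {"literal": "Medications:", "category": "medications"},
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "section_rules.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example, f)
    rule_dicts = load_rule_dicts(path)
```

Each dictionary returned here has the shape that `from_dict` is described as accepting, so a loader along these lines would hand every entry to `SectionRule.from_dict`.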
to_dict()
Converts the SectionRule to a Python dictionary. Used when writing section rules to a JSON file.
Returns:
| Name | Type | Description |
|---|---|---|
| rule_dict | | The dictionary containing the SectionRule info. |
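The round trip between `to_dict` and `from_dict` can be sketched with a stand-in class. `SectionRuleSketch` below is hypothetical and is not medspaCy's implementation: it only mirrors the documented constructor parameters, and it makes three assumptions of its own: `on_match` is left out because callables are not JSON-serializable, `None` values are omitted from the dictionary, and unknown keys are rejected.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Dict, List, Optional


@dataclass
class SectionRuleSketch:
    """Hypothetical stand-in mirroring the documented parameters."""
    literal: str
    category: str
    pattern: Optional[Any] = None
    max_scope: Optional[int] = None
    parents: Optional[List[str]] = None
    parent_required: bool = False
    metadata: Optional[Dict[Any, Any]] = None

    def to_dict(self) -> Dict[str, Any]:
        # Assumption: drop unset (None) values to keep the output compact.
        return {k: v for k, v in asdict(self).items() if v is not None}

    @classmethod
    def from_dict(cls, rule_dict: Dict[str, Any]) -> "SectionRuleSketch":
        # Assumption: reject keys that are not constructor parameters.
        allowed = {f.name for f in fields(cls)}
        unknown = set(rule_dict) - allowed
        if unknown:
            raise ValueError(f"Unknown keys: {sorted(unknown)}")
        return cls(**rule_dict)


rule = SectionRuleSketch("Past Medical History:", "pmh", max_scope=2)
rule_dict = rule.to_dict()
restored = SectionRuleSketch.from_dict(rule_dict)
```

Because the dictionary uses the same names as the constructor parameters, rebuilding the rule is a plain keyword expansion, and the restored object compares equal to the original.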