Glossary
Categorical Data
Qualitative data that is divided into categories or groups. Each piece of data belongs to one of these categories or groups and cannot be subdivided further. Categorical data is useful for filtering or grouping continuous measures. Examples include speaker name, location, and gender.
Continuous Data
Data that can take on any numerical value within a range or interval. This type of data is usually measured or observed and can be divided into ever smaller units. Examples include age, score, and ID.
Dictionary-counted scores
Numerical scores from measures that the Receptiviti API generates through language analysis. For dictionary-counted scores, the API analyzes one word at a time: as each word is processed, the dictionary file is searched category by category for a match with the current word. If the target word matches a category word, the appropriate word category scale (or scales) for that word is incremented. Most dictionary-counted scores range from 0 to 1. The exception is SALLEE, whose scores range from -1 to +1.
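The word-by-word counting described above can be sketched as follows. This is a minimal illustration, not the Receptiviti implementation: the dictionary contents, the tokenizer, the `dictionary_scores` name, and the choice to report each category as a proportion of total words are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical toy dictionary (category -> words); not the real LIWC or Receptiviti dictionary.
TOY_DICTIONARY = {
    "positive_emotion": {"happy", "glad", "love"},
    "social": {"friend", "love", "team"},
}

def dictionary_scores(text, dictionary=TOY_DICTIONARY):
    """Return, for each category, the proportion of words in `text` that match it."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:                          # process one word at a time
        for category, vocabulary in dictionary.items():
            if word in vocabulary:              # a word may match several categories
                counts[category] += 1
    total = len(words)
    return {c: (counts[c] / total if total else 0.0) for c in dictionary}

scores = dictionary_scores("I love my team and I am glad")
print(scores)
```

In this toy input, "love" increments both categories, which mirrors the "scale (or scales)" wording in the definition.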
Framework
Groups of measures designed to capture specific psychological phenomena about a person, derived from their use of language. For example, the Big 5 Personality framework uses a combination of LIWC measures designed to measure extraversion (among other traits), and the combination of measures that makes up the extraversion category has a specific weight that affects the resulting score.
Input Values
Any value within a file that you submit for analysis.
Language Style Matching
Language Style Matching (LSM) is a measure of the distance between two people's frequency scores in nine LIWC function word categories. LSM is highly predictive of a wide range of interpersonal outcomes and behaviors. When two people are paying attention to one another, their language tends to mirror one another’s. This process is similar to physical mirroring but is more subconscious. Linguistic mirroring occurs at the grammatical level and is measurable as the distance between the frequencies of different categories of function words. A valid LSM analysis requires a minimum of 50 words.
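As a rough illustration of turning per-category distances into a single matching score, the sketch below uses a formulation common in the LSM research literature: for each category, 1 minus the absolute difference in frequency divided by the sum of the two frequencies, averaged across categories. Whether the Receptiviti API computes LSM exactly this way is an assumption, and the function name, category names, and sample frequencies are hypothetical. Only the 50-word minimum comes from the definition above.

```python
MIN_WORDS = 50  # minimum word count for a valid LSM, per the glossary entry

def lsm_score(freqs_a, freqs_b, words_a, words_b, eps=0.0001):
    """Average per-category similarity between two speakers' function word frequencies.

    freqs_a / freqs_b map category name -> frequency (e.g. percent of words).
    Returns None when either sample is too short for a valid LSM.
    """
    if words_a < MIN_WORDS or words_b < MIN_WORDS:
        return None
    categories = sorted(freqs_a.keys() & freqs_b.keys())
    if not categories:
        raise ValueError("no shared function word categories")
    per_category = [
        # eps avoids division by zero when both frequencies are 0
        1 - abs(freqs_a[c] - freqs_b[c]) / (freqs_a[c] + freqs_b[c] + eps)
        for c in categories
    ]
    return sum(per_category) / len(per_category)

# Hypothetical frequencies for two of the function word categories.
speaker_a = {"pronouns": 12.0, "articles": 6.0}
speaker_b = {"pronouns": 10.0, "articles": 6.0}
match = lsm_score(speaker_a, speaker_b, words_a=120, words_b=95)
print(round(match, 3))
```

Under this formulation a score near 1 indicates closely matched function word use, and a score near 0 indicates very different use.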
Measures
Categories of words, gathered from spoken or written text, that can be used to measure the count or proportion of psychologically relevant language. The Receptiviti API includes measures that come from LIWC as well as measures from our frameworks, which have algorithms built on top of the measures. Within Receptiviti frameworks, measures are organized by their relevance to each other and to psycholinguistic analysis as a whole. Measures produce scores.
Normed scores
Numerical scores from measures that the Receptiviti API generates through language analysis. Normed (normalized) scores are baselined against our proprietary datasets, which consist of language samples that exceed 350 words. Normed measures always provide scores in the range of 0 to 100. A language sample that generates a score of 80 means that 80% of all samples in our curated baseline dataset score lower than the language sample being analyzed.
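The percentile reading of a normed score can be illustrated with a small sketch. The baseline values and the `normed_score` function are hypothetical; the real baseline datasets are proprietary, and the actual norming procedure may differ from this simple "percentage of baseline scores below" calculation.

```python
from bisect import bisect_left

def normed_score(raw_score, baseline_scores):
    """Percentage (0 to 100) of baseline scores strictly below `raw_score`."""
    if not baseline_scores:
        raise ValueError("baseline_scores must not be empty")
    ordered = sorted(baseline_scores)
    # bisect_left returns how many baseline values are strictly less than raw_score
    below = bisect_left(ordered, raw_score)
    return 100.0 * below / len(ordered)

# Hypothetical raw scores for ten baseline language samples.
baseline = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
result = normed_score(0.085, baseline)
print(result)  # 80.0: eight of the ten baseline samples score lower
```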
Taxonomy
A framework for assessing the use of particular topics found in language samples. Taxonomies work like LIWC in that they count instances of utterances relevant to a topic. The key difference is that they are not used to try to reveal a person's psychology. Instead, words are counted and placed in categories, and the output is a score based on the percentage of words in a category within a text sample. Taxonomies are a binary measure in that we are only looking for an instance of a theme or topic to flag in a given text sample. For example, in the Work-related Topics Taxonomy, a mention of the word ‘commute’ will increment the category simply because it has been mentioned.
Zero/Non-Zero
Turning on the zero/non-zero toggle filters your dataset, homing in on the subset of samples that score above zero on a chosen measure. Visuals that use the zero/non-zero toggle display scores only for the sentences in the dataset whose score on the chosen filter-by-measure is above zero.
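The effect of the toggle can be approximated outside the product with a simple filter. This is a sketch only: the row layout, the measure names, and the `filter_non_zero` helper are hypothetical and do not reflect how the toggle is implemented.

```python
def filter_non_zero(rows, measure):
    """Keep only the rows whose score on `measure` is above zero."""
    return [row for row in rows if row.get(measure, 0) > 0]

# Hypothetical scored sentences; "anger" is the chosen filter-by-measure.
rows = [
    {"sentence": "That was infuriating.", "anger": 0.25, "social": 0.0},
    {"sentence": "We met at noon.", "anger": 0.0, "social": 0.2},
    {"sentence": "They yelled at us.", "anger": 0.2, "social": 0.4},
]
non_zero = filter_non_zero(rows, "anger")
print([row["sentence"] for row in non_zero])
```

Any average or chart computed from `non_zero` then reflects only the sentences where the chosen measure was present.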