Inter-rater Reliability of Sustained Aberrant Movement Patterns as a Clinical Assessment of Muscular Fatigue
Abstract
Background:
The assessment of clinical manifestation of muscle fatigue is an effective procedure in establishing therapeutic exercise dose. Few studies have evaluated physical therapist reliability in establishing muscle fatigue through detection of changes in quality of movement patterns in a live setting.
Objective:
The purpose of this study is to evaluate the inter-rater reliability of physical therapists’ ability to detect altered movement patterns due to muscle fatigue.
Design:
A reliability study in a live setting with multiple raters.
Participants:
Forty-four healthy individuals (ages 19-35) were evaluated by six physical therapists in a live setting.
Methods:
Participants were evaluated by physical therapists for altered movement patterns during resisted shoulder rotation. Each participant completed a total of four tests: right shoulder internal rotation, right shoulder external rotation, left shoulder internal rotation and left shoulder external rotation.
Results:
For all tests combined, the inter-rater reliability for a single rater scoring, ICC (2,1), was .65 (95% CI: .60, .71). This corresponds to moderate inter-rater reliability between physical therapists.
Limitations:
The results of this study apply only to healthy participants and therefore cannot be generalized to a symptomatic population.
Conclusion:
Moderate inter-rater reliability was found between physical therapists in establishing muscle fatigue through the observation of sustained altered movement patterns during dynamic resistive shoulder internal and external rotation.
BACKGROUND
Therapeutic exercise is defined as the systematic performance or execution of planned physical movements, postures, or activities intended to: 1) remediate or prevent impairments, 2) enhance function, 3) reduce risk, 4) optimize overall health, and 5) enhance fitness and well-being [1]. Therapeutic exercise has demonstrated its effectiveness for patients across broad areas of practice such as neurological, musculoskeletal, and cardiopulmonary conditions [2-4]. It is the cornerstone of physical therapy clinical practice and, in turn, makes up forty percent (40%) of government reimbursements for physical therapy [5, 6]. Despite therapeutic exercise’s high utilization and demonstrated usefulness, inconsistent clinical outcomes remain for several commonly treated conditions within the scope of physical therapy [2, 7]. Reported clinical and research variability in the effectiveness of exercise for certain conditions such as osteoarthritis of the knee, total hip arthroplasty, and shoulder impingement syndrome may not be due to the effect of therapeutic exercise on that condition. Rather, inconsistent effects may be due to the lack of appropriate intensity of the exercise program [2-4, 7, 8].
In order for an exercise and/or exercise program to be effective, an appropriate intensity is required to trigger the expected histological and physiological response. Intensity of an exercise is determined by its dose, of which two defining parameters are exercise resistance and number of repetitions performed [9-13]. To determine the appropriate resistance and repetitions, several methods and tests can be applied. For healthy subjects, the proper resistance is commonly based on the establishment of one repetition max (1RM). The procedure to establish a dynamic 1RM for an exercise requires multiple muscle contractions close to 1RM. After establishing 1RM, sub-maximal resistance levels can be established for healthy subjects by using existing reference tables and equations. Many factors influence the relationship between 1RM and sub-maximal intensity levels such as: overall training status, specificity of training status, genetic factors, gender, exercise specificity, type of muscle action, execution speed, and presence of disease or injury. Testing close to 1RM also requires a certain level of familiarity with the exercise and training status for safety. Therefore, it may be inappropriate to develop even a healthy subject’s training program based solely on the testing of 1RM. Further, a clinical setting involves injured or diseased tissue that may not tolerate the use of 1RM due to further risk of tissue injury or increased symptoms [14]. An alternate tool is the evaluation of the clinical manifestation of muscle fatigue, to which central and peripheral factors contribute. In healthy subjects, the major source of fatigue appears to be within the muscle fiber itself [15]. A study by Braun [16] indicates that intramuscular factors contributed 80% to the muscle fatigue with the remainder being attributable to central nervous system factors.
A wide variety of definitions exist for muscle fatigue. According to Bigland-Ritchie and Sogaard, “to circumvent this limitation, most investigators invoke a more focused definition of muscle fatigue as an exercise-induced reduction in the ability of muscle to produce force or power whether or not the task can be sustained [17, 18].” Enoka reports, “a critical feature of this definition is the distinction between muscle fatigue and the ability to continue the task. Accordingly, muscle fatigue is not the point of task failure or the moment when the muscle becomes exhausted. Rather, muscle fatigue is a decrease in the maximal force or power that the involved muscles can produce, and it develops gradually soon after the onset of the sustained physical activity [19, 20].”
Fatigue can be evaluated through several volitional and non-volitional means. Changes in EMG and torque output can be used as measurements of fatigue. These tests are very specific to the muscle group tested, the type of muscular contraction, the velocity of the movement, the movement range, and the equipment utilized [21]. Equipment requirements and limitations on testing functional movement patterns may make them impractical in a clinical setting. Tests relying on torque output may not take into consideration changes in a movement pattern that may occur prior to muscular exhaustion [22]. With these limitations established, how does a clinician ascertain the appropriate dose of dynamic exercises? In the 1960s, Oddvar Holten developed the concept of Medical Exercise Therapy (MET), which promotes a submaximal testing procedure. The testing procedure determines the resistance based on the goal repetitions and the use of a theoretical curvilinear relationship between repetitions and percent of 1RM to guide the testing. In order to establish the exercise dose, the patient is asked to complete, with a given resistance, as many repetitions as possible to fatigue. It has been reported that muscle fatigue in the upper extremities may result in diminished proprioception and kinesthesia [23, 24], which leads to a decline in quality of movement. A decline in the quality of movement is considered one of the first observational clinical signs of local neuro-muscular fatigue and may even occur prior to the patient’s perception of fatigue [25]. Muscle fatigue may also be observed as a change in execution speed, and/or an alteration in movement range, and/or the appearance of accessory muscle contractions and movements. In this scenario, it is imperative for clinicians to effectively identify fatigue to prevent further injury and to appropriately dose the therapeutic exercise.
The proposed testing procedure relies on the clinician’s ability to detect changes in the quality of movement. In order for the testing procedure to be clinically useful, reliability must be established. The purpose of this study is to evaluate the inter-rater reliability of physical therapists’ ability to detect aberrant movement patterns due to muscle fatigue. Aberrant movement is defined as a deviation of the normal movement pattern and may be observed as changes in movement range, movement accelerations or decelerations, and out of plane movements due to a decrease of neuromuscular control around the physiological axis of movement. The ability to detect aberrant movements would allow a therapist to establish an appropriate dose for therapeutic exercise. It is hypothesized that good inter-rater reliability can be established for detecting altered movement patterns due to local muscular fatigue for repetitive glenohumeral external and internal rotation through visual assessment.
MATERIALS AND METHODOLOGY
Participants
Forty-seven (47) individuals, both men and women, over the age of eighteen (18) were recruited on a voluntary basis. Exclusion criteria included upper quarter pain within the last year, history of shoulder surgical repair dysfunction, lack of functional shoulder active range of motion, or a body mass index above thirty (30).
A screening was performed on all forty-seven (47) participants; however, only forty-four (44) completed the full testing. One student was ill on the day of testing and two did not attend the final testing session. The screening process was completed to ensure participants did not meet the exclusion criteria and were fit to participate in the study. All participants were informed about the purpose, objectives, and methods of the study. Each participant read and signed an informed consent. The institutional review board at Andrews University approved this research.
Of the forty-four (44) subjects, 21 (47.7%) were female and 23 (52.3%) were male, with an average age of 22.9 years (± 3.6; 19-35 years). Descriptive information for height, weight, BMI, hand dexterity, grip and isometric external and internal rotation shoulder strength is provided in Table 1.
Variable | N | % | Mean | SD | Min | Max |
---|---|---|---|---|---|---|
Gender | | | | | | |
Male | 23 | 52.3 | | | | |
Female | 21 | 47.7 | | | | |
Age (years) | | | 22.91 | 3.63 | 19.00 | 35.00 |
Height (cm) | | | 172.44 | 9.77 | 147.0 | 188.0 |
Weight (kg) | | | 69.35 | 13.64 | 42.00 | 95.50 |
Body Mass Index | | | 23.14 | 3.14 | 18.10 | 29.90 |
Grip Strength R (kg) | | | 42.15 | 11.69 | 23.30 | 65.50 |
Grip Strength L (kg) | | | 39.42 | 11.35 | 20.50 | 64.40 |
Dexterity | | | | | | |
Right | 40 | 91 | | | | |
Left | 4 | 9 | | | | |
Right Shoulder Isometric ER (kg) | | | 10.03 | 2.56 | 3.90 | 15.30 |
Left Shoulder Isometric ER (kg) | | | 10.46 | 2.90 | 3.40 | 15.80 |
Right Shoulder Isometric IR (kg) | | | 13.10 | 3.10 | 5.80 | 21.00 |
Left Shoulder Isometric IR (kg) | | | 11.56 | 2.81 | 6.00 | 18.70 |
Six (6) physical therapist raters were recruited from local area outpatient orthopedic physical therapy clinics. Each was a licensed physical therapist and all were actively working throughout the states of Michigan and Indiana. They spent, on average, more than 37 hours per week (range 24-40) in direct patient care. The average experience was 12.5 years (± 2.3; 10-15 years), with the majority holding a Master’s degree (66.7%). Descriptive information pertaining to the raters is presented in Table 2. The investigators provided the raters with background information on the exercise technique and testing procedures in a live instructional session. The raters were allowed to practice scoring, and were given written instructions and a video detailing the testing procedures.
Variable | N | % | Mean | SD | Min | Max |
---|---|---|---|---|---|---|
Gender | | | | | | |
Male | 2 | 33.3 | | | | |
Female | 4 | 66.7 | | | | |
Age (years) | 6 | | 39.67 | 3.27 | 35 | 45 |
Type of Degree | | | | | | |
Bachelor's | 1 | 16.7 | | | | |
Master's | 4 | 66.7 | | | | |
Doctoral | 1 | 16.7 | | | | |
Specialty Certification | | | | | | |
No | 5 | 83.3 | | | | |
Yes | 1 | 16.7 | | | | |
Years in Practice | 6 | | 12.50 | 2.59 | 10 | 15 |
Hours Worked Per Week | | | 37.33 | 6.53 | 24 | 40 |
Equipment
Grip strength was assessed in kilograms using a Jamar Dynamometer (Lafayette Instrument Company, Inc, Lafayette, Indiana) for a series of three (3) repetitions, with the maximal score used as the measurement. Isometric strength was measured in kilograms using a Lafayette Manual Muscle Test System (Lafayette Instrument Company, Inc, Lafayette, Indiana). For dynamic exercise, participants used the standard STEENS pulley system (STEENS Industrier, AS, Ski, Norway) with one hundred (100) gram resistance increments and a multi-purpose bench.
Procedure
Enrollment and Assessment for Eligibility
As shown in Fig. (1), the procedure for participants started with an assessment of eligibility. Screening included a questionnaire for past medical history related to upper quarter bilateral symptoms, age, gender, height in centimeters and dexterity. The physical exam included evaluation for posture, upper motor neuron dysfunction, cervical active range of motion, shoulder active range of motion, thoracic active range of motion, and shoulder girdle active range of motion deficits. The participants were also tested for grip strength in kilograms, isometric shoulder internal and external rotation strength in kilograms, and weight in kilograms.
Baseline Testing
Each participant was asked to perform four (4) isometric maximum strength tests: right shoulder external rotation, right shoulder internal rotation, left shoulder external rotation and left shoulder internal rotation. Each test involved a series of three (3) consecutive repetitions of isometric strength assessment with resistance at the wrist in neutral and shoulder in 30 degrees of flexion and 30 degrees of abduction. The maximal score was used as the measurement. Following isometric testing, each participant was placed in a sitting position on a multi-purpose bench with the elbow supported and the shoulder in an open packed position. The starting position included the elbow at ninety (90) degrees of flexion and the wrist in neutral. The pulley and bench were set up in a manner to allow for perpendicular angles of both the pulley rope with the humerus and the pulley rope with the forearm at mid-range. This allowed the rope to remain in the appropriate plane of movement. Resistance was provided by the STEENS pulley at approximately fifteen (15) to fifty (50) percent of the maximum isometric strength assessed during baseline testing. Each participant was then asked to complete active shoulder external (Fig. 2) or internal (Fig. 3) rotation to exhaustion. The screening resistance was adjusted to allow for an adequate number of repetitions to be assessed in the future testing session. Each participant was instructed in the testing procedures and given a testing appointment three weeks later to allow for full recovery from the baseline testing.
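The resistance window described above (approximately 15 to 50 percent of the maximum isometric strength) can be illustrated with a short calculation. This is only a sketch: the function name `resistance_window_kg` and its parameters are hypothetical, and the study does not describe how a specific resistance within the window was chosen for each participant.

```python
def resistance_window_kg(max_isometric_kg, low_fraction=0.15, high_fraction=0.50):
    """Return the (lowest, highest) pulley resistance in kg.

    The 15% and 50% defaults follow the range stated in the text;
    everything else here is an illustrative assumption.
    """
    if max_isometric_kg <= 0:
        raise ValueError("maximum isometric strength must be positive")
    return (low_fraction * max_isometric_kg, high_fraction * max_isometric_kg)


# Group mean for right shoulder isometric external rotation from Table 1 (10.03 kg).
low, high = resistance_window_kg(10.03)
print(f"Resistance window: about {low:.1f} kg to {high:.1f} kg")
```

For the group mean of 10.03 kg this gives a window of roughly 1.5 to 5.0 kg; an individual participant's window would be computed from that participant's own maximal score.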
Testing Session
During live testing, each participant was identified by a pre-assigned number and placed as previously described on a multi-purpose bench with the resistance adjusted based on the earlier screening session. They were then instructed by the investigator to perform as many repetitions as possible, without counting, until they could not physically complete one more repetition. The raters were simultaneously instructed by the co-investigator to count and record the number of repetitions performed prior to the appearance of a sustained altered movement pattern. A sustained altered movement pattern was defined as a change in speed, range of motion, wrist position, scapular movement, and/or elbow position for at least three (3) consecutive repetitions. Male participants were asked to remove their shirts and female participants were asked to wear halter tops during the testing to allow for full observation of the shoulder girdle and thorax.
A total of four (4) tests were completed for each participant: right shoulder external rotation, left shoulder external rotation, right shoulder internal rotation, and left shoulder internal rotation. Six (6) physical therapist raters simultaneously observed a total of one hundred seventy-six (176) tests. Each rater turned in their score sheet to the co-investigator for each participant. Both raters and participants were blinded to the results of this study.
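The scoring rule above can be expressed as a small counting routine. The sketch below is an illustration only: the raters scored by visual observation, not with software, and the function name `reps_before_sustained_change` and the per-repetition flag list are hypothetical. It also assumes that the recorded score is the number of repetitions completed before the first repetition of the sustained run, which is one reading of the instruction given to the raters.

```python
def reps_before_sustained_change(altered_flags, min_consecutive=3):
    """Count repetitions completed before a sustained altered pattern begins.

    altered_flags: one boolean per repetition, True when the repetition
    was judged altered (speed, range, wrist, scapula, or elbow position).
    Returns None when no run of `min_consecutive` altered repetitions occurs.
    """
    run_length = 0
    for index, altered in enumerate(altered_flags):
        # Extend the current run of altered repetitions, or reset it.
        run_length = run_length + 1 if altered else 0
        if run_length == min_consecutive:
            # The run began (min_consecutive - 1) repetitions earlier;
            # every repetition before that start is counted.
            return index - min_consecutive + 1
    return None


# One isolated altered repetition (the 5th) does not count as sustained;
# the sustained run starts at the 7th repetition.
flags = [False, False, False, False, True, False, True, True, True, True]
print(reps_before_sustained_change(flags))  # 6
```

Under this reading, a test in which the first three repetitions are already altered would be scored as 0.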
Data Analysis
Frequencies were calculated for sex, type of degree, and specialty certifications of the six (6) rater physical therapists and reported in percentages of the sample. Descriptive statistics were calculated for age and years in practice of the six (6) rater physical therapists and expressed as median and range.
Descriptive analyses, including frequencies for categorical variables and median and range for continuous variables, were calculated for all forty-four (44) subjects. Because the data were continuous and multiple physical therapist raters were used, intraclass correlation coefficients (ICCs) with 95% confidence intervals (CIs) were used for the reliability analysis. We used ICC form (2,1) for the calculation of the intraclass correlation coefficients. This single-rating model allows for demonstrating confidence that trained physical therapists in a clinical setting would score a test equally [26]. The guidelines used for interpretation of the ICCs were as follows: 0.00 to 0.25 indicated little if any correlation; 0.26 to 0.49 indicated low correlation; 0.50 to 0.69 indicated moderate correlation; 0.70 to 0.89 indicated high correlation; and 0.90 to 1.00 indicated very high correlation [27]. All data were analyzed using SPSS 20.0 (SPSS Inc, Chicago, Illinois).
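For readers unfamiliar with ICC form (2,1), the following sketch shows how the coefficient is commonly computed from a two-way analysis of variance (two-way random effects, absolute agreement, single rater). It is not the SPSS procedure used in this study; the function name `icc_2_1` is hypothetical, confidence intervals are not computed, and the function assumes a complete table, so any test with a missing rater score would need to be dropped first.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per test and one column per rater,
    with no missing values.
    """
    n = len(ratings)       # number of tests (targets)
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with at least 2 tests and 2 raters")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for tests (rows), raters (columns), and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustration only: four tests scored by six raters (repetition counts).
example = [
    [7, 5, 7, 4, 5, 6],
    [7, 5, 5, 6, 7, 5],
    [16, 20, 15, 17, 15, 18],
    [15, 16, 17, 14, 12, 14],
]
print(round(icc_2_1(example), 2))
```

Because the rater (column) variance appears in the denominator, systematic differences between raters lower this coefficient, which is why the form suits the question of whether any single trained therapist would score a test equally.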
RESULTS
Each subject performed four (4) dynamic tests. A total of 176 tests were performed, of which 175 were scored by all six raters because one therapist was absent for one test of right shoulder external rotation. The range of repetitions for all scored tests was between 0 and 54 repetitions. Descriptive information for all scored tests for each rater is presented in Table 3. For all tests, the inter-rater reliability for a single rater scoring (ICC 2,1) was .65 (95% CI: .60, .71). The highest inter-rater reliability for a single rater scoring (ICC 2,1) was found for right shoulder internal rotation, with an ICC of .68 (95% CI: .56, .78). The inter-rater reliability values for a single rater scoring (ICC 2,1) for right shoulder external rotation, left shoulder external rotation and left shoulder internal rotation were, respectively, .65 (95% CI: .54, .76), .67 (95% CI: .55, .77) and .63 (95% CI: .51, .75).
Rater | N (tests) | Range (reps) | Minimum (reps) | Maximum (reps) | Mean (reps) | Std. Deviation (reps) |
---|---|---|---|---|---|---|
Rater 1 | 176 | 53 | 1 | 54 | 15.68 | 8.345 |
Rater 2 | 176 | 47 | 2 | 49 | 15.55 | 7.133 |
Rater 3 | 176 | 44 | 5 | 49 | 15.26 | 6.815 |
Rater 4 | 175 | 39 | 2 | 41 | 17.39 | 8.366 |
Rater 5 | 176 | 32 | 5 | 37 | 15.73 | 6.080 |
Rater 6 | 176 | 34 | 0 | 34 | 14.75 | 7.637 |
Valid N | 175 | | | | | |
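The interpretation guidelines given in the Data Analysis section can be applied to the reported coefficients with a small lookup. This is a sketch for illustration; the function name `interpret_icc` is hypothetical, and rounding to two decimals before comparison is an assumption made because the published bands are stated to two decimal places.

```python
def interpret_icc(icc):
    """Map an ICC to the descriptive bands used in this study [27]."""
    value = round(icc, 2)  # bands are defined to two decimal places
    if not 0.0 <= value <= 1.0:
        raise ValueError("ICC outside the 0.00 to 1.00 range covered by the guidelines")
    if value <= 0.25:
        return "little if any"
    if value <= 0.49:
        return "low"
    if value <= 0.69:
        return "moderate"
    if value <= 0.89:
        return "high"
    return "very high"


# Point estimates reported in the Results.
reported = {
    "all tests": 0.65,
    "right internal rotation": 0.68,
    "right external rotation": 0.65,
    "left external rotation": 0.67,
    "left internal rotation": 0.63,
}
for test_name, icc in reported.items():
    print(f"{test_name}: {icc:.2f} ({interpret_icc(icc)})")
```

All five point estimates fall in the moderate band. Applying the same lookup to the reported confidence limits places every lower limit in the moderate band and every upper limit in the high band.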
DISCUSSION
Sustained aberrant changes in movement patterns are the clinical signs of local muscular fatigue. This study investigated the inter-rater reliability of detecting changes in movement patterns due to local neuro-muscular fatigue. Six practicing physical therapists simultaneously observed a total of one hundred seventy-six (176) shoulder tests. A single-measures ICC (2,1) model was used to calculate single-rater reliability and the observations were performed in a live setting. This allowed for generalization to the clinical setting where a physical therapist would likely score a single test. Moderate correlations were found for all four (4) tests. These results compare well with other studies to determine inter-rater reliability of observational movement tests for upper extremities. McClure et al. [28] investigated the inter-rater reliability of six (6) raters (3 pairs) to classify shoulder dyskinesia through videotaped analysis and limited visual assessment in a live setting. The researchers found moderate inter-rater reliability with an average weighted Kappa .57 for live raters and .54 for those viewing videotape. They classified the scapular motion as normal, subtle dyskinesis, or obvious dyskinesis. McClure et al. felt their findings represented satisfactory reliability of clinicians working in a clinical setting. Kibler et al. [29] also used a visually based system for assessing shoulder dyskinesia and reported Kappa coefficients of .42 and .32 for inter-rater reliability among physical therapists and physicians. They concluded reliability was sufficient, with refinement, to allow utilization of this assessment in a clinical setting. Similar results have been reported in lower extremity studies. Davis et al. [30] evaluated inter-rater and intra-rater reliability of active hip abduction. They noted that previous research used movement observation as an appropriate assessment tool for clinicians. 
Their study used a four-point assessment scale and demonstrated an inter-rater reliability ICC (2,1) of .70 (0.56, 0.84). The observational tests reported in the aforementioned studies all used limited scoring categories ranging from two (2) to four (4). The investigators identified few studies that used a larger scoring range and quality of movement as scoring parameters.
Moreland et al. [31] investigated the inter-rater reliability of six (6) trunk tests of which three were dynamic endurance tests. Three (3) raters tested thirty-nine (39) subjects during three (3) testing days. They found acceptable inter-rater reliability for dynamic abdominal endurance test (ICC .59), dynamic abdominal endurance test limited to 75 repetitions (ICC .89), and extensor dynamic endurance (ICC .78). The tests were controlled for speed and technique, and tests were stopped when pain occurred. It appeared the tests were dictated by task failure or exhaustion, rather than by subtle changes in movement pattern as is the case in this study.
Limitations and Opportunities for Further Research
This study has several limitations. The participants in this study were healthy and therefore the results of this study cannot be generalized to a symptomatic population. Although the testing protocol does not allow for pain or other symptoms to arise or increase, the movement changes which may occur in symptomatic individuals may not solely be related to the appearance of muscular fatigue. Fear of further injury may cause aberrant movement patterns to arise earlier than the onset of fatigue.
Although 1) movement analysis is an integral part of the skill set of physical therapists [32], 2) physical therapists believe they are skilled at accurately assessing movement through observation [33], 3) the six (6) physical therapists functioning as raters were experienced therapists and 4) education was provided in different formats to optimize learning, no assessment of the raters was performed in determining how accurately they could recognize a change in movement patterns. Only one therapist functioning as a rater was familiar with the assessment method prior to the study and uses it in daily clinical practice. The raters were in close proximity to each other. Although the raters were clearly instructed not to interact with each other during the testing sessions, the setting did not exclude potential visual and auditory interaction. This study used a design of six (6) physical therapist raters simultaneously evaluating change in movement patterns in a live dynamic exercise setting. Videotaped subjects with an expert consensus on when actual movement changes occurred and a pre-study competency assessment of the raters may have increased the reliability.
Clinical Implications
The current study provides support for the medical exercise therapy framework to establish appropriate dose of an exercise in a clinical setting. Practical and clinical limitations of available protocols may prevent therapists from establishing appropriate exercise intensity. In addition, a study by Haas et al. [34] investigated clinical decision making in exercise prescription for fall prevention. It revealed evidence of dissonance between prescribing what was thought to be a correct dose on the basis of physiological considerations and prescribing based on a therapist’s perception of a patient’s adherence. The reliable observation of clinical fatigue allows the clinician to determine the appropriate exercise intensity by ensuring the correct resistance for the targeted number of repetitions. For example, to increase muscular strength the number of repetitions should be low and the resistance should be high. The results for right and left shoulder external rotation for two of the study participants are presented in Table 4. Participant A performed both exercises with a resistance of 1.75 kg. The raters indicated a change in movement pattern occurred between 4 and 7 repetitions for right shoulder external rotation and between 5 and 7 for left shoulder external rotation, while exhaustion occurred at 14 and 12 repetitions respectively. Although the point at which fatigue was observed in Participant A varied between raters, based on the test results, all physical therapists using the medical exercise therapy framework would prescribe a similar exercise dose to obtain the physiological response for gaining muscular strength. Participant B performed external rotation with 2.0 kg. The raters indicated a change in movement pattern occurred between 15 and 20 repetitions for right shoulder external rotation and between 12 and 17 for left shoulder external rotation, while exhaustion occurred at 37 and 32 repetitions respectively.
Physical therapists would prescribe a similar exercise dose if the above scenario occurred in the clinic. While exhaustion is easier to identify in a clinical setting, testing to exhaustion also increases the risk of further injury or increased pain. Both examples for Participant A and Participant B demonstrate the difference between the number of repetitions for the occurrence of aberrant movement patterns and exhaustion. Traditional training protocols result in high metabolite accumulation and associated discomfort, pain, and exhaustion. Recent reports from Folland et al. and Izquierdo et al. suggest that similar gains can be achieved with lower levels of training stimuli with an appropriate level of fatigue [35, 36]. Therefore, we recommend that physical therapists prescribe an exercise dose based on the establishment of clinical fatigue.
Rater | 1 | 2 | 3 | 4 | 5 | 6 | Exhaustion |
---|---|---|---|---|---|---|---|
Participant A (repetitions) | | | | | | | |
Right | 7 | 5 | 7 | 4 | 5 | 6 | 14 |
Left | 7 | 5 | 5 | 6 | 7 | 5 | 12 |
Participant B (repetitions) | | | | | | | |
Right | 16 | 20 | 15 | 17 | 15 | 18 | 37 |
Left | 15 | 16 | 17 | 14 | 12 | 14 | 32 |
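The gap between observed clinical fatigue and exhaustion in Table 4 can be summarized numerically. The sketch below simply restates the table values; the dictionary layout, the function name `summarize`, and the choice of the mean rater score as a summary are illustrative assumptions and were not part of the study's analysis.

```python
from statistics import mean

# Values transcribed from Table 4 (external rotation, repetitions).
table_4 = {
    ("A", "right"): {"ratings": [7, 5, 7, 4, 5, 6], "exhaustion": 14},
    ("A", "left"): {"ratings": [7, 5, 5, 6, 7, 5], "exhaustion": 12},
    ("B", "right"): {"ratings": [16, 20, 15, 17, 15, 18], "exhaustion": 37},
    ("B", "left"): {"ratings": [15, 16, 17, 14, 12, 14], "exhaustion": 32},
}


def summarize(ratings, exhaustion):
    """Return the rater range and the mean score as a fraction of exhaustion."""
    average = mean(ratings)
    return {
        "min": min(ratings),
        "max": max(ratings),
        "mean": average,
        "mean_fraction_of_exhaustion": average / exhaustion,
    }


for (participant, side), row in table_4.items():
    s = summarize(row["ratings"], row["exhaustion"])
    print(
        f"Participant {participant}, {side}: raters {s['min']}-{s['max']} reps, "
        f"mean {s['mean']:.1f}, exhaustion {row['exhaustion']} "
        f"({s['mean_fraction_of_exhaustion']:.0%} of exhaustion)"
    )
```

In all four rows the mean rater score is less than half of the repetitions performed to exhaustion, which illustrates how much earlier the sustained change in movement pattern was observed.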
CONCLUSION
This study demonstrated clinical reliability of identification of changes in movement patterns by practicing physical therapists as an appropriate assessment tool of clinical fatigue. The assessment of clinical fatigue can be used to establish appropriate exercise dose for muscular strength and endurance gains. Suggestions for further research include investigating the reliability of assessment of clinical fatigue in a symptomatic population, the reliability of clinical fatigue assessment with different movement patterns, and the potential of increasing reliability with increased education.
CONFLICT OF INTEREST
The authors confirm that this article content has no conflict of interest.
ACKNOWLEDGEMENTS
The authors are thankful to John Carlos Jr. (carlosjj@andrews.edu) and Leo Wouters for serving as academic and clinical advisors. The authors also thank Eric Thordarson for assisting with the recruitment of participants, and the physical therapist raters for performing the assessments.