Material & Methods

Study subjects:

  • 36 free-ranging dogs living on the streets on the coast near Agadir, Morocco.
  • 35 free-ranging dogs that came from the streets but now live in a shelter. They were tested as a backup for the test-retest measure.


Dogs were tested individually during the morning hours. Two experimenters were involved: I, as the main experimenter (E1), conducted the tests, and a colleague (E2) filmed and helped set them up. The entire test battery was conducted in direct succession.

Behavioural variables

The coded behaviours differed slightly across subtests and were coded when displayed either towards the experimenter or the object. In short, the main behavioural categories and behaviours were:

  • Proximity
  • Attitudes
  • Tail wagging
  • Gazing
  • Physical contact
  • Displacement behaviours
  • Vocalization
  • Subtest-specific behaviours
  • Disturbances
  • Non-visible
  • Termination

Reliability analysis

The intraclass correlation coefficient (ICC) was calculated to assess how closely each rater agreed with their own earlier coding and with the other rater, and whether the dogs behaved the same way in both tests. An ICC between -1 and 0.5 is commonly regarded as poor reliability, an ICC between 0.5 and 0.7 as moderate, and an ICC between 0.7 and 1.0 as good.
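The classification above can be written as a small lookup. This is only a sketch: the function name `classify_icc` is hypothetical, and because the stated ranges share their endpoints, the handling of exactly 0.5 and 0.7 (assigned here to the higher category) is an assumption rather than something the text specifies.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC value to the reliability label used in the text.

    Assumption: boundary values belong to the higher category,
    i.e. 0.5 counts as "moderate" and 0.7 counts as "good".
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.7:
        return "moderate"
    return "good"


# Illustrative values only, not study results.
for value in (0.31, 0.62, 0.85):
    print(value, classify_icc(value))
```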

  • For the coding reliability, 20% of the videos were recoded by the same coder after 3 weeks (intra-rater reliability) and 20% of the full-test videos were recoded by a second coder (inter-rater reliability). The ICC was calculated for each behaviour across all subtests with a two-way random-effects ANOVA. Based on the classification above and on previous dog personality tests, an ICC > 0.7 with a p-value < 0.01 was regarded as acceptable coding reliability.
  • For the test-retest reliability, the same dogs were tested again with the same test after 6 weeks (“retest”). All 35 of the shelter dogs and 23 (72%) of the street dogs were found again for the retest. The test procedure was the same, but the fake dog and the novel object were slightly altered to create some novelty. The ICC was calculated for each behaviour in each subtest with a two-way mixed-effects ANOVA. Only behaviours that occurred in more than 10% of all testing occasions were included in this analysis. Based on the classification above and on previous dog personality tests, an ICC > 0.5 with a p-value < 0.05 was regarded as acceptable test-retest reliability.
  • For both analyses, any subtest of which more than 70% was coded as non-visible or as a disturbance was excluded from the analysis.
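The two ICC models and the inclusion rules listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script used in the study: the text does not say whether single or average measures, or absolute agreement or consistency, were used, so the sketch assumes the single-measure forms often labelled ICC(2,1) for the two-way random-effects model and ICC(3,1) for the two-way mixed-effects model. All function names are hypothetical, the p-value (F test) is omitted, and zero-variance data are not handled.

```python
from statistics import mean


def anova_mean_squares(scores):
    """Two-way ANOVA mean squares for a subjects x measurements table.

    scores: list of rows, one row per dog (or video); each row holds the
    k values from the k coders or test occasions.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between subjects
    msc = ss_cols / (k - 1)                 # between coders / occasions
    mse = ss_error / ((n - 1) * (k - 1))    # residual
    return msr, msc, mse, n, k


def icc_two_way_random(scores):
    """Single-measure ICC(2,1), assumed here for the coding reliability."""
    msr, msc, mse, n, k = anova_mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_two_way_mixed(scores):
    """Single-measure ICC(3,1), assumed here for the test-retest reliability."""
    msr, msc, mse, n, k = anova_mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse)


def subtest_is_usable(unusable_fraction):
    """Exclude a subtest if more than 70% was non-visible or disturbed."""
    return unusable_fraction <= 0.70


def behaviour_is_frequent(occurrences, occasions):
    """Keep a behaviour only if it occurred in more than 10% of occasions."""
    return occurrences / occasions > 0.10


# Made-up test/retest values for five dogs, for illustration only.
example = [[4, 5], [2, 2], [7, 6], [1, 2], [5, 5]]
print(round(icc_two_way_mixed(example), 2))
```

The mixed-effects form ignores systematic differences between the two columns (for example, all dogs scoring slightly higher at retest), whereas the random-effects form penalises them, which is one common reason for choosing different models for coder agreement and test-retest consistency.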