API Reference

This section documents the functions available in this project. To get started, import the package with import pyvocals.


extract_features()

extract_features(p1, p2, p1_label = 'Child', p2_label = 'Parent', start_time = None, fs = None)

Extract switching and interruptive turn events for each social partner.

Parameters

p1 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the first partner's vocalizations.

p2 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the second partner's vocalizations.

p1_label (str, optional)

The name of the first partner; by default, 'Child'.

p2_label (str, optional)

The name of the second partner; by default, 'Parent'.

start_time (datetime.datetime, optional)

A datetime value denoting the start time of the vocalization data.

fs (int, optional)

The sampling rate of the vocalization instances. This value must be provided if start_time is not None.

Returns

dyad_vocals (pandas.DataFrame)

A DataFrame containing each partner's time series of extracted vocal states and turn-taking features.

Example

> import datetime
> import pyvocals
> p1 = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
> p2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
> start_time = datetime.datetime(2025, 2, 23, 13, 23, 0)
> vocal_turns = pyvocals.extract_features(p1, p2, start_time = start_time,
                                          fs = 1)
> vocal_turns.head()

Output:
            Timestamp  Child  Parent  Child_ST  Child_IT  Parent_ST  Parent_IT
0 2025-02-23 13:23:00      0       1       NaN       NaN        NaN        NaN
1 2025-02-23 13:23:01      0       1       NaN       NaN        NaN        NaN
2 2025-02-23 13:23:02      0       1       NaN       NaN        NaN        NaN
3 2025-02-23 13:23:03      5       1       NaN       1.0        NaN        NaN
4 2025-02-23 13:23:04      1       0       NaN       1.0        NaN        NaN
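
The relationship between start_time, fs, and the Timestamp column can be sketched with the standard library alone. The helper sample_timestamps below is hypothetical and is not part of pyvocals; it assumes that the first sample falls exactly on start_time and that consecutive samples are 1 / fs seconds apart.

```python
import datetime


def sample_timestamps(start_time, fs, n_samples):
    """Return one timestamp per sample, spaced 1 / fs seconds apart.

    Hypothetical helper for illustration only; assumes the first sample
    is recorded exactly at start_time.
    """
    step = datetime.timedelta(seconds=1 / fs)
    return [start_time + i * step for i in range(n_samples)]


start_time = datetime.datetime(2025, 2, 23, 13, 23, 0)

# At fs = 1 Hz, samples are one second apart.
one_hz = sample_timestamps(start_time, fs=1, n_samples=3)

# At fs = 4 Hz, samples are 250 ms apart.
four_hz = sample_timestamps(start_time, fs=4, n_samples=3)
```

This also illustrates why fs is needed whenever start_time is given: without a sampling rate, the spacing between timestamps is undefined.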

preprocess_audio()

preprocess_audio(file, start_time = None, target_fs = 4)

Pre-process vocalization data from a social partner's audio file into vocalization instances.

Parameters

file (str)

The filepath of the audio file (.wav, .mp3).

start_time (datetime.datetime, optional)

A datetime value denoting the start time of the audio file. If None, audio data will be resampled using the mean of values within the target sampling interval.

target_fs (int, optional)

The target sampling rate to which the original signal is resampled; by default, 4 Hz.

Returns

signal (array-like)

An array containing the pre-processed vocalization signal.
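
Resampling by the mean of values within each target sampling interval can be illustrated with a small, self-contained sketch. The function downsample_by_mean and its orig_fs argument are assumptions made for this example; they are not part of pyvocals, and the library's actual resampling may differ. The sketch assumes the original rate is an integer multiple of the target rate and drops any trailing partial interval.

```python
def downsample_by_mean(samples, orig_fs, target_fs=4):
    """Average consecutive blocks of samples down to target_fs samples per second.

    Hypothetical helper for illustration; assumes orig_fs is an integer
    multiple of target_fs.
    """
    if orig_fs % target_fs != 0:
        raise ValueError("orig_fs must be a multiple of target_fs in this sketch")
    block = orig_fs // target_fs      # original samples per target interval
    n_blocks = len(samples) // block  # a trailing partial block is dropped
    return [
        sum(samples[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


# One second of a toy 8 Hz signal averaged down to 4 Hz.
toy_signal = [0.0, 0.5, 0.25, 0.75, 1.0, 1.0, 0.0, 0.0]
resampled = downsample_by_mean(toy_signal, orig_fs=8, target_fs=4)
```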

get_vocal_states()

get_vocal_states(p1, p2, p1_label = 'Child', p2_label = 'Parent', start_time = None, fs = None)

Process vocalization instances of a dyad into vocal states. Numeric values of vocal states are as follows:

  • 1 = Vocalization
  • 2 = Pause
  • 3 = Switching pause
  • 4 = Non-interruptive simultaneous speech
  • 5 = Interruptive simultaneous speech
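
When working with the numeric codes listed above, a named mapping can make downstream code easier to read. The VocalState enum and describe helper below are hypothetical conveniences, not part of pyvocals; codes outside 1 to 5 (for example, the 0 values that appear in the example output) are passed through unchanged because their meaning is not listed here.

```python
from enum import IntEnum


class VocalState(IntEnum):
    """Names for the numeric vocal state codes (illustrative only)."""
    VOCALIZATION = 1
    PAUSE = 2
    SWITCHING_PAUSE = 3
    NONINTERRUPTIVE_SIMULTANEOUS_SPEECH = 4
    INTERRUPTIVE_SIMULTANEOUS_SPEECH = 5


def describe(codes):
    """Replace known codes with their names; leave unknown codes as-is."""
    known = {state.value for state in VocalState}
    return [VocalState(code).name if code in known else code for code in codes]


labels = describe([0, 1, 5])
```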

Parameters

p1 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the first partner's vocalizations.

p2 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the second partner's vocalizations.

p1_label (str, optional)

The name of the first partner; by default, 'Child'.

p2_label (str, optional)

The name of the second partner; by default, 'Parent'.

start_time (datetime.datetime, optional)

A datetime value denoting the start time of the vocalization data.

fs (int, optional)

The sampling rate of the vocalization instances. This value must be provided if start_time is not None.

Returns

tuple: If start_time is None, returns:

p1_vocal_states (array-like)

An array containing the first partner's processed vocal states.

p2_vocal_states (array-like)

An array containing the second partner's processed vocal states.

pandas.DataFrame: If start_time is not None, returns a DataFrame with the following columns:

  • 'Timestamp': Timestamped intervals.
  • p1_label (by default, 'Child'): The first partner's processed vocal states.
  • p2_label (by default, 'Parent'): The second partner's processed vocal states.

Example

> import datetime
> import pyvocals
> p1 = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
> p2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
> start_time = datetime.datetime(2025, 2, 23, 13, 23, 0)
> vocal_states = pyvocals.get_vocal_states(p1, p2, start_time = start_time,
                                           fs = 1)
> vocal_states.head()

Output:
            Timestamp  Child  Parent
0 2025-02-23 13:23:00      0       1
1 2025-02-23 13:23:01      0       1
2 2025-02-23 13:23:02      0       1
3 2025-02-23 13:23:03      5       1
4 2025-02-23 13:23:04      1       0

References

Jaffe, J., & Feldstein, S. (1970). Rhythms of dialogue. Academic Press.


find_vocal_turns()

find_vocal_turns(p1, p2, fs = 4, max_pause_duration = 5)

Identify indices of when each person's switching and interruptive turns begin and end.

Parameters

p1 (array-like)

An array containing the first partner's vocal states.

p2 (array-like)

An array containing the second partner's vocal states.

fs (int, float, optional)

The sampling rate of the vocalization instances; by default, 4 Hz.

max_pause_duration (int, float, optional)

The maximum allowable duration of a pause (in seconds) during which a vocal turn is still considered valid; by default, 5.

Returns

tuple:

p1_switching_turns (list)

A list of tuples containing indices denoting the start and end of the first partner's switching turns.

p1_interrupt_turns (list)

A list of tuples containing indices denoting the start and end of the first partner's interruptive turns.

p2_switching_turns (list)

A list of tuples containing indices denoting the start and end of the second partner's switching turns.

p2_interrupt_turns (list)

A list of tuples containing indices denoting the start and end of the second partner's interruptive turns.
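
Because the returned turns are expressed as sample indices while max_pause_duration is expressed in seconds, it can help to convert between the two using fs. The helpers below are hypothetical and are not part of pyvocals; the turn tuples in the example are made-up values in the (start, end) shape described above.

```python
def turns_to_seconds(turns, fs=4):
    """Convert (start, end) sample indices into (start, end) times in seconds."""
    return [(start / fs, end / fs) for start, end in turns]


def pause_limit_in_samples(max_pause_duration=5, fs=4):
    """Express a pause limit given in seconds as a number of samples."""
    return int(max_pause_duration * fs)


# Made-up turn indices, shaped like one of the returned lists.
example_turns = [(4, 12), (20, 30)]
turn_times = turns_to_seconds(example_turns, fs=4)

# With the default arguments, 5 seconds at 4 Hz corresponds to 20 samples.
limit = pause_limit_in_samples()
```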

find_pauses()

find_pauses(p1, p2)

Identify indices of two social partners' pause occurrences.

Parameters

p1 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the first partner's vocalizations.

p2 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the second partner's vocalizations.

Returns

pauses1 (array-like)

An array containing indices of the first partner's pauses.

pauses2 (array-like)

An array containing indices of the second partner's pauses.

find_switching_pauses()

find_switching_pauses(p1, p2)

Identify indices of two social partners' switching pause occurrences. A switching pause is defined as a pause bounded by the end of one partner's vocalization and the start of the other partner's vocalization.

Parameters

p1 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the first partner's vocalizations.

p2 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the second partner's vocalizations.

Returns

tuple:

p1_switching_pauses (array-like)

An array containing indices of the first partner's switching pauses.

p2_switching_pauses (array-like)

An array containing indices of the second partner's switching pauses.
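
The definition above can be illustrated with a plain-Python sketch that separates stretches of joint silence into within-speaker pauses and switching pauses. This is not the pyvocals implementation: the function classify_silences is hypothetical, it returns (start, end) index pairs with inclusive ends rather than per-partner index arrays, and it skips silences at the edges of the recording or next to overlapping speech.

```python
def classify_silences(p1, p2):
    """Split joint-silence runs into pauses and switching pauses.

    Returns (pauses, switching), each a list of (start, end) index pairs
    with inclusive ends. Illustrative only.
    """
    pauses, switching = [], []
    n = len(p1)
    i = 0
    while i < n:
        if p1[i] or p2[i]:
            i += 1
            continue
        start = i
        while i < n and not (p1[i] or p2[i]):
            i += 1
        end = i - 1
        if start == 0 or i == n:
            continue  # silence is not bounded by speech on both sides
        before = (p1[start - 1], p2[start - 1])
        after = (p1[i], p2[i])
        if sum(before) != 1 or sum(after) != 1:
            continue  # both partners vocalize on one side; left unclassified
        if before == after:
            pauses.append((start, end))     # same partner resumes
        else:
            switching.append((start, end))  # the other partner takes over
    return pauses, switching


a = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
b = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
pauses, switching = classify_silences(a, b)
```

In this toy input, the silence at indices 2 to 3 follows the first partner and precedes the second, so it counts as a switching pause, while the silence at indices 7 to 8 is bounded by the first partner on both sides and counts as an ordinary pause.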

find_simultaneous_speech()

find_simultaneous_speech(p1, p2)

Identify indices of two social partners' interruptive simultaneous speech (ISS) and non-interruptive simultaneous speech (NSS) occurrences.

Parameters

p1 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the first partner's vocalizations.

p2 (array-like)

An array containing occurrences (1) and non-occurrences (0) of the second partner's vocalizations.

Returns

tuple:

p1_iss (array-like)

An array containing indices of the first partner's ISS occurrences.

p1_nss (array-like)

An array containing indices of the first partner's NSS occurrences.

p2_iss (array-like)

An array containing indices of the second partner's ISS occurrences.

p2_nss (array-like)

An array containing indices of the second partner's NSS occurrences.
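
Simultaneous speech occurs at samples where both partners vocalize. The sketch below only locates those overlapping samples; it does not attempt the ISS/NSS classification, whose criteria are not described in this section. The helper overlap_indices is hypothetical and is not part of pyvocals.

```python
def overlap_indices(p1, p2):
    """Return the indices at which both partners vocalize at the same time."""
    return [i for i, (a, b) in enumerate(zip(p1, p2)) if a and b]


p1 = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
p2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
overlaps = overlap_indices(p1, p2)
```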

plot_vocals()

plot_vocals(p1, p2, fs, seg_num = 1, seg_size = 15, p1_label = 'Child', p2_label = 'Parent')

Visualize two social partners' vocalization time series.

Parameters

p1 (array-like)

An array containing the first partner's vocalizations.

p2 (array-like)

An array containing the second partner's vocalizations.

fs (int)

The sampling rate of the input data.

seg_num (int, optional)

The segment number to visualize; by default, 1.

seg_size (int, optional)

The length of the segment (in seconds) to be visualized; by default, 15 seconds.

p1_label (str, optional)

The name of the first partner; by default, 'Child'.

p2_label (str, optional)

The name of the second partner; by default, 'Parent'.

Returns

fig (matplotlib.figure.Figure)

A figure containing two subplots, one for each social partner's vocal states.
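
To get a feel for how seg_num and seg_size might select a window of samples, the sketch below computes the slice bounds for a given segment. It rests on two assumptions that this section does not confirm: segments are numbered from 1, and they are consecutive and non-overlapping. The helper segment_bounds is hypothetical and is not part of pyvocals.

```python
def segment_bounds(fs, seg_num=1, seg_size=15):
    """Return (start, stop) sample indices for a segment, with stop exclusive.

    Assumes 1-based, consecutive, non-overlapping segments of seg_size seconds.
    """
    if seg_num < 1:
        raise ValueError("seg_num is assumed to start at 1 in this sketch")
    start = int((seg_num - 1) * seg_size * fs)
    stop = int(seg_num * seg_size * fs)
    return start, stop


# At 4 Hz, a 15-second segment spans 60 samples.
first = segment_bounds(fs=4)
second = segment_bounds(fs=4, seg_num=2)
```

Under these assumptions, p1[start:stop] and p2[start:stop] would be the samples shown for the chosen segment.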