Modelo de clasificación de calidad de voz de llamadas de un Call Center mediante redes neuronales convolucionales y regresión logística
Authors
Castañeda Chilón, José Rolando
Abstract
Evaluating the voice-handling skills of call center agents is essential to ensuring high-quality customer service. Traditionally, this task falls to quality analysts, who listen to calls directly to identify good and poor practices so that behaviors can be reinforced or corrected. This method is highly subjective, however: each evaluation depends on the individual analyst's perception, which can produce biased scores that affect performance management. To address this problem, this research proposes an automated approach to classifying the quality of the agents' voice using two machine learning models: Logistic Regression and Convolutional Neural Networks (CNN). The main input was the Mel-Frequency Cepstral Coefficients (MFCC), which capture vocal energy across different frequency ranges and thus represent the acoustic characteristics of the agent's initial greeting. For Logistic Regression, a 68-by-40 matrix was constructed in which each row is an audio recording and each column is the average value of one MFCC. For the CNN, spectrograms were generated from each set of 40 MFCCs, composed of 1003 points, and used as visual inputs, since these networks process image-like representations. The data were preprocessed and split into training (70%) and test (30%) sets. The CNN model performed better than Logistic Regression, achieving an F1-score of 0.78, a precision of 0.69, and a recall of 0.90. The high recall highlights its ability to detect recordings with deficient voice quality. Taken together, these findings support deep learning-based models as a complementary tool for improving quality management in customer service centers, reducing subjectivity and supporting decision-making by supervisors.
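The figures quoted in the abstract can be sanity-checked with a short sketch. It is not the thesis code: the MFCC values are random stand-ins rather than real audio, reading the "1003 points" as time frames per recording is an assumption, and so is rounding the test share of the 70/30 split upward. The helper names `mean_mfcc` and `f1_score` are hypothetical.

```python
import math
import random

# Sizes quoted in the abstract; treating 1003 as frames per recording is an assumption.
N_AUDIOS, N_MFCC, N_FRAMES = 68, 40, 1003


def mean_mfcc(mfcc):
    """Collapse an (n_mfcc x n_frames) matrix to one time-averaged value per coefficient."""
    return [sum(row) / len(row) for row in mfcc]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Synthetic stand-in for the MFCC matrix of one greeting (random values, not real audio).
rng = random.Random(0)
mfcc = [[rng.gauss(0.0, 1.0) for _ in range(N_FRAMES)] for _ in range(N_MFCC)]

# One row of the 68 x 40 matrix described for Logistic Regression.
features = mean_mfcc(mfcc)

# 70/30 split of 68 recordings; rounding the test share up is an assumption.
n_test = math.ceil(0.30 * N_AUDIOS)
n_train = N_AUDIOS - n_test

# F1 implied by the reported precision (0.69) and recall (0.90).
reported_f1 = f1_score(0.69, 0.90)

print(len(features), n_train, n_test, round(reported_f1, 2))
# prints: 40 47 21 0.78
```

Under these assumptions the test set holds about 21 recordings, and the reported precision and recall are consistent with the reported F1-score of 0.78.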
Description
Universidad Nacional Agraria La Molina. Escuela de Posgrado. Maestría en Estadística Aplicada
Keywords
Call Center
Date
2025
License

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess

