Abstract:
Choosing a career is one of the most important decisions we make in life, and it
involves much more than what we do for a living, so it is important to choose a career
that we are happy with. Few of us are fortunate enough to simply know what we want to
do and to find fulfilling work without deliberation. Many people choose a profession
without much thought or for the wrong reasons, for example because it seems safe or
lucrative, and later become dissatisfied. A well-thought-out decision is the best way
to ensure that this does not happen to us.
The proposed work paves the way for students to choose their ideal career at
school age. The work is based on Bloom’s Taxonomy, a hierarchical framework that
divides learning objectives into levels of increasing complexity, ranging from basic
knowledge and understanding to evaluation and creation. Bloom’s Taxonomy covers three
domains of learning: cognitive, affective and psychomotor. Within each domain,
learning can occur at different levels, from basic to advanced. The cognitive domain is
primarily concerned with intellectual skills such as critical thinking, problem solving
and knowledge building. The affective domain concerns the learner’s attitudes, values,
interests and appreciation. The learner’s ability to physically perform activities,
execute movements and apply
skills falls under the psychomotor domain. Our work uses the cognitive domain. The
proposed model is based on Holland’s theory. Its input is a set of answers to questions
corresponding to the RIASEC types (Realistic, Investigative, Artistic, Social,
Enterprising, Conventional), and its output is a set of academic skills together with
occupations derived from those skills. The model was developed in two phases. In the
first phase, academic skills were predicted using Holland’s system. In the second
phase, careers were predicted from those academic skills. Since no suitable dataset
existed, we collected the necessary information ourselves. A Google Form with 30
questions (5 for each of the six Holland codes) was distributed to students and other
participants. The form also included fields for the respondent’s name and job title.
In total, 159 people responded, and we
used the resulting dataset for our work. The required dataset was collected and then
set aside for preparation. The dataset was then loaded into Google Colab. After
pre-processing, Decision Tree, Random Forest, K-Nearest Neighbour and Multi-output
Regressor algorithms were used to build machine learning models. Each model
was trained on 80% of the collected data and tested on the remaining 20%.
Of these models, the Multi-output Regressor achieved the highest accuracy,
99.9%. Thus, the academic skills were successfully predicted.
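The first phase described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the survey data is replaced by synthetic answers, the six skill targets are hypothetical, the base estimator wrapped by the multi-output regressor is assumed to be a random forest, and the score reported is R², since the text does not say how accuracy was measured.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey (NOT the authors' data):
# 159 respondents, 30 answers on a 1-5 scale, 5 questions per RIASEC code.
n_respondents, n_questions, n_codes = 159, 30, 6
X = rng.integers(1, 6, size=(n_respondents, n_questions)).astype(float)

# Hypothetical targets: one "academic skill" score per RIASEC code, taken
# here as the mean of that code's five answers plus a little noise.
riasec_means = X.reshape(n_respondents, n_codes, 5).mean(axis=2)
y = riasec_means + rng.normal(0.0, 0.05, size=riasec_means.shape)

# 80/20 split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One regressor is fitted per target column; the base estimator is an assumption.
model = MultiOutputRegressor(RandomForestRegressor(n_estimators=100, random_state=0))
model.fit(X_train, y_train)

skills_pred = model.predict(X_test)   # one row of six skill scores per test respondent
r2 = model.score(X_test, y_test)      # R^2 averaged over the six targets
print(skills_pred.shape, round(r2, 3))
```

With 159 respondents, an 80/20 split leaves 127 rows for training and 32 for testing.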
The next step was to determine the ideal occupation based on the predicted academic
skills. Support Vector Machine (SVM), Gaussian Naive Bayes, Perceptron, Decision Tree
and Random Forest were used to build machine learning models. Along with the
academic skills predicted in the first phase of the work, the occupational information
from the dataset was used to train and test the models. Of these models, the SVM had
the highest accuracy, 78.125%, and the career was effectively predicted. Python in
Google Colab was used throughout. We also tried applying the Synthetic Minority
Oversampling Technique (SMOTE) to career prediction. For some of the models, the
accuracy decreased, while for others it improved. Of the five models used, Gaussian
Naive Bayes and Random Forest showed a decrease in accuracy, while Perceptron
achieved the highest accuracy of 80%.
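To illustrate the second phase and the effect of oversampling, the sketch below trains an SVM on imbalanced career labels with and without SMOTE-style balancing. Everything in it is an assumption made for illustration: the data is synthetic, the three job titles and their class sizes are hypothetical, and `smote_oversample` is a simplified hand-written version of the SMOTE idea rather than the library implementation the authors may have used. Oversampling is applied to the training split only, which the text does not specify.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC


def smote_oversample(X, y, k=3, seed=0):
    """Simplified SMOTE: add synthetic points to each minority class by
    interpolating between a sample and one of its k nearest same-class
    neighbours until every class matches the largest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = int(counts.max())
    X_parts, y_parts = [X], [y]
    for label, count in zip(classes, counts.tolist()):
        n_new = target - count
        if n_new == 0 or count < 2:
            continue  # already the largest class, or too few points to interpolate
        X_c = X[y == label]
        n_neighbors = min(k, count - 1)
        # Ask for one extra neighbour: each point is its own nearest neighbour
        # (assumes no exact duplicates, which holds for continuous data).
        nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X_c)
        neighbours = nn.kneighbors(X_c, return_distance=False)[:, 1:]
        base = rng.integers(0, count, size=n_new)
        picked = neighbours[base, rng.integers(0, n_neighbors, size=n_new)]
        gap = rng.random((n_new, 1))  # position along the segment base -> neighbour
        X_parts.append(X_c[base] + gap * (X_c[picked] - X_c[base]))
        y_parts.append(np.full(n_new, label))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Synthetic, imbalanced stand-in for "academic skills -> job title" (hypothetical).
rng = np.random.default_rng(0)
sizes = {"engineer": 90, "teacher": 45, "artist": 24}  # 159 rows in total
centres = {
    "engineer": [4, 4, 2, 2, 3, 3],
    "teacher": [2, 3, 3, 4, 3, 3],
    "artist": [2, 2, 4, 3, 3, 2],
}
X = np.vstack([rng.normal(centres[job], 0.8, size=(n, 6)) for job, n in sizes.items()])
y = np.repeat(list(sizes), list(sizes.values()))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
X_bal, y_bal = smote_oversample(X_train, y_train)  # balance the training split only

acc_plain = SVC().fit(X_train, y_train).score(X_test, y_test)
acc_smote = SVC().fit(X_bal, y_bal).score(X_test, y_test)
print(f"SVM accuracy without oversampling: {acc_plain:.3f}")
print(f"SVM accuracy with oversampling:    {acc_smote:.3f}")
```

As the text notes, oversampling does not help every model: the synthetic points change the class balance the classifier sees, which can raise or lower test accuracy depending on the model and the data.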