Xerox Data Scientist (Natural Language Processing) in Redmond, Washington

We are looking for an experienced machine learning / natural language processing data scientist to join a new and exciting team focused on delivering advanced analytics solutions in the electronic discovery and compliance industry. The advanced analytics team is at the center of providing insights and predictions in support of new strategic offerings and products, bringing real value to our clients through analysis of data.

As part of Xerox Legal Business Services you will work on developing the analytics-as-a-service platform. Created by Xerox scientists and legal experts, the platform gives corporate counsel and law firms real-time visibility into the millions of documents they review and classify for their litigation, investigation and regulatory compliance matters.

You will have a proven track record of tackling tough problems, bringing the right algorithmic solutions to large data sets, and ensuring that any solution accurately fits the business problem at hand. Equally important are the ability to work as part of a growing team, open-mindedness across a range of problems and techniques, and a willingness to learn from and contribute to the team.

Xerox has a diverse services business, processing large amounts of data in domains which include healthcare, finance, transportation and customer care, among others. We also have world-leading research and development in all aspects of data, text and visual analytics across five research centers around the globe.

Position Description – Data Scientist (Natural Language Processing)


Develop robust, scalable and maintainable machine learning models to answer business problems against large data sets

Build methods for document clustering, topic modeling, text classification, named entity recognition, sentiment analysis, and POS tagging

Perform data cleaning, feature selection and feature engineering, and organize experiments in line with best practices

Benchmark, apply, and test algorithms against success metrics, and interpret the results by relating those metrics to the business process

Work with development teams to ensure models can be implemented as part of a delivered solution replicable across many clients

Visualize data to tell compelling stories


3 years’ experience with Machine Learning, NLP, Document Classification, Topic Modeling and Information Extraction with a proven track record of applying them to real problems and real data

Experience working with big data systems and big data concepts

Ability to provide clear and concise communication both with other technical teams and non-technical domain specialists

Strong team player; the ability both to make a strong individual contribution and to work as part of a team and contribute to wider goals is a must in this dynamic environment

Experience with noisy and/or unstructured textual data

Strong coding ability with statistical analysis tools in Python or R, and general software development skills (source code management, debugging, testing, deployment, etc.)

Working knowledge of various text mining algorithms and their use cases, such as keyword extraction, PLSA, LDA, HMM, CRF, deep learning and recurrent ANN, word2vec/doc2vec, and Bayesian modeling

Strong understanding of text pre-processing and normalization techniques, such as tokenization, POS tagging and parsing, and of how they work at a low level

Excellent problem-solving skills

Strong verbal and written communication skills

Practical experience using NLP-related techniques and algorithms

Proven track record of analyzing data and bringing value across a range of projects

Experience with Greenplum and MADlib highly desirable

Experience in open source coding and communities desirable


Master's degree or higher in data mining or machine learning, or equivalent practical analytics / modeling experience

Xerox is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, creed, religion, ancestry, national origin, age, gender identity, sex, marital status, sexual orientation, physical or mental disability, use of a guide dog or service animal, military/veteran status, citizenship status, basis of genetic information, or any other group protected by Federal or State law or local ordinance. People with disabilities who need a reasonable accommodation to apply or compete for employment with Xerox Business Services, LLC may request such accommodation(s) by sending an e-mail to . Be sure to include your name, the accommodation you are seeking, and the job you are interested in.


Job: Product & Systems Design Engineering

Organization: XLS

Title: Data Scientist (Natural Language Processing)

Location: Washington-Redmond

Requisition ID: 16030129

Virtual/work from home? No