CRITEO (http://labs.criteo.com/) | Paris, France | Senior Data Scientist | Full-time ONSITE (VISA sponsorship)
You will be in charge of improving our Criteo Engine, a key component of our business offering. Our technology takes an algorithmic approach to determining which user we show an ad to, when, and for which products.
We serve two billion unique adverts daily while keeping latency under 10 ms. Our Data Scientists are responsible for using those 10 ms as efficiently as possible to estimate the value of each display and drive the best performance for our clients.
More specifically:
- Build and improve all our models
- Gather and analyze data to extract valuable information relevant to our models
- Identify key prediction problems and propose innovative solutions
- Report, visualize and communicate results
- Contribute to the exploration and creation of new scientific understanding
• Click prediction: How do you accurately predict whether a user will click on an ad in less than a millisecond? Thankfully, you have billions of data points to help you.
• Recommender systems: A standard SVD works well. But what happens when you have to choose the top products amongst hundreds of thousands for every user, 2 billion times per day, in less than 50 ms?
• Auction theory: In a second-price auction, the theoretically optimal strategy is to bid your expected value (see the worked example after this list). But what happens when you run 15 billion auctions per day against the same competitors?
• Explore/exploit: It's easy: UCB and Thompson sampling have low regret (see the sketch after this list). But what happens when new products come and go, and when each ad displayed changes the reward of each arm?
• Offline testing: You can always compute the classification error of a model predicting the probability of a click. But is that really related to the online performance of a new model?
• Optimization: Stochastic gradient descent is great when you have lots of data. But what do you do when not all data are equal and you must distribute the learning over several hundred nodes?
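A quick worked example for the auction-theory bullet: a toy simulation (the values and the 80% shading factor are made up, and this is not our bidder) showing why, in a one-shot second-price auction, bidding your true value weakly dominates shading. In Python:

    # Toy one-shot second-price auction: you win if your bid beats the
    # highest rival bid, and you pay that rival bid, not your own.
    import random

    def payoff(value, bid, best_rival_bid):
        return value - best_rival_bid if bid > best_rival_bid else 0.0

    value = 1.0              # what the display is worth to you (made up)
    truthful = shaded = 0.0
    for _ in range(100_000):
        rival = random.uniform(0.0, 2.0)             # highest rival bid
        truthful += payoff(value, value, rival)      # bid your true value
        shaded += payoff(value, 0.8 * value, rival)  # bid 80% of your value

    # Shading only loses auctions priced below your value, so the truthful
    # total comes out ahead (by about 0.01 per auction in this setup).
    print(truthful >= shaded)

The point of the bullet is that this clean result stops holding once you run billions of repeated auctions against the same competitors, where your bids leak information over time.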
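And a minimal Thompson-sampling sketch for the explore/exploit bullet, assuming a Beta-Bernoulli bandit where each ad is an arm with an unknown click-through rate; the arm names and reward model are illustrative, not our production system:

    import random

    class ThompsonSampler:
        """Beta-Bernoulli Thompson sampling over a changing set of ads."""

        def __init__(self, arms):
            # Beta(1, 1) prior per arm, i.e. uniform over click-through rates.
            self.stats = {arm: [1, 1] for arm in arms}  # arm -> [alpha, beta]

        def pick(self):
            # Sample a plausible CTR per arm from its posterior and show
            # the ad with the highest sample: exploration comes for free.
            return max(self.stats,
                       key=lambda a: random.betavariate(*self.stats[a]))

        def update(self, arm, clicked):
            self.stats[arm][0 if clicked else 1] += 1

        def add_arm(self, arm):
            # New products come and go: a fresh arm starts at the prior,
            # so it keeps being explored until its posterior concentrates.
            self.stats.setdefault(arm, [1, 1])

        def drop_arm(self, arm):
            self.stats.pop(arm, None)

    # Toy run: three ads with hidden CTRs; the sampler converges on ad_b.
    true_ctr = {"ad_a": 0.02, "ad_b": 0.05, "ad_c": 0.03}
    bandit = ThompsonSampler(true_ctr)
    for _ in range(10_000):
        arm = bandit.pick()
        bandit.update(arm, random.random() < true_ctr[arm])

The catch hinted at in the bullet: every ad you display shifts the rewards of the other arms, so the independent-arms assumption baked into this sketch is exactly what breaks at scale.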
CRITEO (http://labs.criteo.com/) | Paris, France | Full-time ONSITE | Team Lead – Live Traffic (VISA sponsorship)
Our team working on large-scale, real-time traffic analysis and anomaly detection is looking for a lead to join them in our Paris R&D. We would expect you to:
• Be creative: investigate and think outside the box to protect us from fraud
• Help create brand-new tools to detect and monitor suspicious traffic
• Play with Scala in our first R&D team to use streaming
• Work closely with all departments, bridging R&D and Product with Escalation when needed
• Provide security guidance and expertise on anomaly management
Feel free to drop us a line at rndrecruitment[@]criteo.com =)
CRITEO (http://labs.criteo.com/) | Paris or Palo Alto | Full-time ONSITE | Software Engineer (VISA sponsorship for Paris)
Your mission:
• Build systems that make the best decision in 50ms, half a million times per second. Across three continents and six datacenters, 24/7.
• Find the signal hidden in tens of TB of data, in one hour, using over a thousand nodes on our Hadoop cluster. And constantly keep getting better at it while measuring the impact on our business.
• Get stuff done. A problem partially solved today is better than a perfect solution next year. Have an idea during the night? Code it in the morning, push it at noon, test it in the afternoon, and deploy it the next morning.
• High stakes, high rewards: 1% increase in performance may yield millions for the company. But if a single bug goes through, the Internet goes down (we’re only half joking).
• Develop open source projects. Because we are working at the forefront of technology, we are dealing with problems that few have faced. We’re big users of open source, and we’d like to give back to the community.
Be the guardian of systems that make the best decision in 50ms, half a million times per second. Across three continents and seven datacenters, 24/7.
• Assess the importance of technical issues, coordinate action within a team of 5+ people, and resolve or escalate issues to the right level
• Investigate complex problems and find innovative answers to blocking issues
• Coordinate with 400+ people and the operational teams to ensure that the most critical issues are handled quickly and efficiently
• Help influence R&D in improving the platform's weak spots, develop tools to get an accurate map of the biggest issues, and diagnose the platform automatically
• Be part of the level-2 on-duty team and help maintain a good level of service, with some on-call responsibilities
• Implement measures to ensure that once an incident has happened, it never happens again
CRITEO (http://labs.criteo.com/) | Paris - Palo Alto | ONSITE | Software Development Engineer (VISA sponsorship for Paris)
Missions of the team in more detail: http://labs.criteo.com/wp-content/uploads/2015/04/Software-E...
Feel free to drop me a line at n.rassam[at]criteo.com