Archive for the 'Tecnología' Category
Thursday, April 14th, 2011
The following algorithm computes the least squares solution, minimizing ||Ax - b|| subject to the equality constraint Bx = d. It’s a classic algorithm that can be implemented using only a QR decomposition and a least squares solver. This implementation uses numpy and scipy. It makes use of the new linalg.solve_triangular function in scipy 0.9, [...]
Python, Tecnología | Comments (2)
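As an illustration of the excerpt above, here is a minimal sketch of one classic way to solve the equality-constrained problem using only a QR decomposition and a least squares solver (the null-space method). It is not necessarily the post's exact code: the function name lstsq_equality and the sample data are made up for illustration, and B is assumed to have full row rank.

```python
import numpy as np
from scipy import linalg


def lstsq_equality(A, b, B, d):
    """Minimize ||Ax - b|| subject to Bx = d (B assumed to have full row rank)."""
    p, n = B.shape
    # Full QR of B^T: Q is (n, n) and the top (p, p) block of R is upper triangular.
    Q, R = linalg.qr(B.T)
    # With x = Q y, the constraint becomes R1^T y1 = d, a triangular system.
    y1 = linalg.solve_triangular(R[:p, :p], d, trans='T')
    # The remaining coordinates y2 are unconstrained: fit them by least squares.
    AQ = A @ Q
    y2 = linalg.lstsq(AQ[:, p:], b - AQ[:, :p] @ y1)[0]
    return Q @ np.concatenate([y1, y2])


# Hypothetical data, for illustration only.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((6, 4)), rng.standard_normal(6)
B, d = rng.standard_normal((2, 4)), rng.standard_normal(2)
x = lstsq_equality(A, b, B, d)
print(np.allclose(B @ x, d))  # checks that the constraint holds up to rounding
```

The constraint fixes the first p coordinates of y = Q^T x, so only the remaining n - p coordinates need a least squares fit.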
Friday, February 11th, 2011
Last weekend I was at FOSDEM presenting scikits.learn (here are the slides I used at the Data Analytics Devroom). Kudos to Olivier Grisel and all the people who organized such a fun and authentic meeting!
General, Tecnología | Comments Off
Friday, December 31st, 2010
The latest release of scikits.learn comes with an awesome collection of examples. These are some of my favorites. Faces recognition: this example, by Olivier Grisel, downloads a 58MB faces dataset from Labeled Faces in the Wild and performs PCA for feature extraction and SVC for classification, yielding a very acceptable 0.85 f1-score. Species [...]
General, scikit-learn, Tecnología | Comments Off
Monday, November 29th, 2010
Based on the work of libsvm-dense by Ming-Wei Chang, Hsuan-Tien Lin, Ming-Hen Tsai, Chia-Hua Ho and Hsiang-Fu Yu, I patched the libsvm distribution shipped with scikits.learn to allow setting weights for individual instances. The motivation behind this is to be able to force a classifier to focus its attention on some samples rather than others. This [...]
scikit-learn, Tecnología | Comments (1)
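To make the idea of per-instance weights concrete, here is a small sketch using present-day scikit-learn, where weights are passed through the sample_weight argument of fit. This is an assumption about the current API, not the scikits.learn interface the post describes, and the data and weights are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy 1-D data; the last sample is the one we want the classifier to focus on.
X = np.array([[-2.0], [-1.0], [1.0], [2.0], [0.5]])
y = np.array([0, 0, 1, 1, 0])

# A larger weight makes misclassifying that sample more costly during training.
weights = np.array([1.0, 1.0, 1.0, 1.0, 10.0])

clf = SVC(kernel="linear")
clf.fit(X, y, sample_weight=weights)
print(clf.predict(X))
```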
Wednesday, November 24th, 2010
Highlights for this release:
* New stochastic gradient descent module by Peter Prettenhofer.
* Improved svm module: memory efficiency, automatic class weights.
* Wrapper for liblinear’s multi-class SVC (option multi_class in LinearSVC).
* New features and performance improvements in text feature extraction.
* Improved sparse matrix support, both in main classes (GridSearch) and in sparse [...]
scikit-learn, Tecnología | Comments (1)
Saturday, October 30th, 2010
For some time now I’ve been missing a function in scipy that exploits the triangular structure of a matrix to efficiently solve the associated system, so I decided to implement it by binding the LAPACK method “trtrs”, which also checks for singularities and is capable of handling several right-hand sides. Contrary to what I expected, binding [...]
scipy, Tecnología | Comments (3)
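Present-day SciPy ships a triangular solver as scipy.linalg.solve_triangular, which appears to be the function this excerpt describes. A short sketch of its use with two right-hand sides follows; the matrix and right-hand sides are made-up values.

```python
import numpy as np
from scipy.linalg import solve_triangular

# Lower-triangular system with two right-hand sides (one per column of b).
L = np.array([[3.0, 0.0, 0.0],
              [2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
b = np.array([[9.0, 3.0],
              [8.0, 1.0],
              [5.0, 2.0]])

# lower=True tells the solver to use only the lower triangle of L.
x = solve_triangular(L, b, lower=True)
print(np.allclose(L @ x, b))  # checks the residual of both systems
```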
Thursday, September 30th, 2010
Lately I’ve been working with Alexandre Gramfort on coding the LARS algorithm in scikits.learn. This algorithm computes the solution to several general linear models used in machine learning: LAR, Lasso, Elasticnet and Forward Stagewise. Unlike the implementation by coordinate descent, the LARS algorithm gives the full coefficient path along the regularization parameter, and thus it is [...]
General, scikit-learn, Tecnología | Comments (4)
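To show what a full coefficient path looks like, here is a minimal sketch using lars_path from present-day scikit-learn (an assumption; the post refers to the scikits.learn code of the time). The synthetic data and true_coef are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import lars_path

# Synthetic regression problem with two informative features out of five.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
true_coef = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_coef + 0.1 * rng.standard_normal(50)

# method="lasso" follows the Lasso path; coefs has one column per breakpoint.
alphas, active, coefs = lars_path(X, y, method="lasso")
for alpha, coef in zip(alphas, coefs.T):
    print(f"alpha={alpha:.4f}  coef={np.round(coef, 3)}")
```

Each column of coefs is the solution at one value of the regularization parameter, so a single call yields the whole path rather than one fit per value.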
Thursday, May 27th, 2010
It is now possible (using the development version as of May 2010) to use Support Vector Machines with custom kernels in scikits.learn. Using it couldn’t be simpler: you just pass a callable (the kernel) to the class constructor. For example, a linear kernel would be implemented as follows: import numpy as np [...]
General, scikit-learn, Tecnología | Comments (1)
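The excerpt is cut off before its code, so the following is only a sketch of the same idea written against present-day scikit-learn (an assumption; the original targeted scikits.learn). The function name linear_kernel and the toy data are chosen for illustration.

```python
import numpy as np
from sklearn.svm import SVC


def linear_kernel(X, Y):
    """Gram matrix of the linear kernel: K[i, j] = <X[i], Y[j]>."""
    return np.dot(X, Y.T)


# Two well-separated clusters, invented for illustration.
X = np.array([[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# The callable is passed directly as the kernel argument.
clf = SVC(kernel=linear_kernel)
clf.fit(X, y)
print(clf.predict(np.array([[-0.8, -1.0], [0.9, 1.2]])))
```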
Wednesday, March 17th, 2010
Suppose some given data points each belong to one of two classes, and the goal is to decide which class a new data point will be in. In the case of support vector machines, a data point is viewed as a p-dimensional vector (2-dimensional in this example), and we want to know whether we can [...]
General, scikit-learn, Tecnología | Comments (4)
Tuesday, March 9th, 2010
LibSVM is a C++ library that implements several Support Vector Machine algorithms that are commonly used in machine learning. It is a fast library that has no dependencies, and most machine learning frameworks bind it in one way or another. LibSVM comes with a Python interface written in SWIG, but this interface is inherently slow [...]
General, scikit-learn, Tecnología | Comments (4)
Thursday, March 4th, 2010
Yesterday we had an extremely productive coding sprint for scikits.learn. The idea was to put people with common interests in a room and have them work on a single codebase. Alexandre Gramfort and Olivier Grisel worked on GLMNet, Bertrand Thirion and Gaël Varoquaux worked on univariate feature selection and Vincent worked on Bayesian Regression. [...]
scikit-learn, Tecnología | Comments Off
Monday, February 1st, 2010
Today I released the first public version of Scikit-Learn (release notes). It’s a Python module implementing some machine learning algorithms, and it’s shaping up quite well. For this release I did not want to make any incompatible changes, so most of them are just bug fixes and updates. For the next release, however, some more radical [...]
General, scikit-learn, sympy, Tecnología | Comments (1)
Thursday, January 7th, 2010
This week we created a SourceForge project to host our development of scikit-learn. Although the project already had a directory in scipy’s repo, we needed more flexibility in the user management and in the mailing list creation, so we opted for SourceForge. To be honest, after using git and Google Code for bug tracking, I [...]
General, scikit-learn, Tecnología | Comments Off
Tuesday, December 22nd, 2009
This week I arrived at the place where I will be working for the next two years: Neurospin. It’s a research center located 20 km from Paris, and so far things are going smoothly: the place is beautiful, work is great and food is excellent. Well OK, I do miss some things from Spain and weather [...]
General, Tecnología | Comments (2)
Tuesday, December 15th, 2009
My new job is about managing an open source package for machine learning in Python. I’ve had some experience with Python now, but I am a total newbie in the field of machine learning, so my first task will be to find a good reference book on the subject and start reading. The books I’ve [...]
General, Tecnología | Comments (10)
Saturday, September 5th, 2009
The Google Summer of Code program is officially over. It has been four months of intense work, exciting benchmarks and patch reviewing. It was a huge pleasure working with you guys! As for the project, I implemented a complete logic module and then an assumption system for sympy (sympy.logic, sympy.assumptions, sympy.queries). I even had time to [...]
General, sympy, Tecnología | Comments (3)
Thursday, August 20th, 2009
I managed to overcome the overhead in ask() that arises when converting between symbol and integer representation of sentences in conjunctive normal form. The result went beyond what I expected. The test suite for the query module got 10 times faster on my laptop. It dropped from 26 seconds to an impressive 2.03 seconds. There is [...]
General, sympy, Tecnología | Comments (1)
Tuesday, August 18th, 2009
Today I’ve been doing some speed improvements for the logic module. More precisely, I implemented an efficient internal representation for clauses in conjunctive normal form. In practice this means a huge performance boost for all problems that make use of the functions satisfiable() or dpll_satisfiable(). For example, test_dimacs.py has moved from 2.7 seconds to an impressive [...]
General, sympy, Tecnología | Comments Off
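For readers unfamiliar with satisfiable(), here is a short sketch of how it is called in present-day SymPy (an assumption about the current API; the 2009 interface may have differed). The formulas are toy examples.

```python
from sympy import symbols
from sympy.logic.inference import satisfiable

x, y = symbols("x y")

# A satisfiable formula returns a model: a dict mapping symbols to truth values.
model = satisfiable(x & ~y)
print(model)

# A contradiction has no model, so the result is False.
no_model = satisfiable(x & ~x)
print(no_model)
```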
Monday, August 17th, 2009
This commit introduced a new module in sympy: the refine module. The purpose of this module is to simplify expressions when they are bound to assumptions. For example, if you know that x>0, then you can simplify abs(x) to x. This code was traditionally embedded into the core, but now this will be part of [...]
sympy, Tecnología | Comments (3)
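The abs(x) example from this excerpt can be sketched with present-day SymPy as follows. The call style with Q predicates is an assumption about the current API and may differ from the interface at the time of the post.

```python
from sympy import Abs, Q, Symbol, refine

x = Symbol("x")

# Knowing x > 0, Abs(x) simplifies to x.
pos = refine(Abs(x), Q.positive(x))
print(pos)

# Knowing x < 0, Abs(x) simplifies to -x.
neg = refine(Abs(x), Q.negative(x))
print(neg)
```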
Monday, August 10th, 2009
The query module is finally in the main SymPy repository. I made substantial changes since the last post, most of them at the user interface level (thanks to Vinzent and Mateusz for many insightful comments). The main function is ask(), which replaces the old expression.is_* syntax. You can ask many things. For example, you can ask if [...]
General, sympy, Tecnología | Comments Off
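As a rough illustration of ask(), the sketch below uses the present-day SymPy signature, ask(proposition, assumptions). This is an assumption: the interface described in the 2009 post was different, and the queries here are not the ones from the truncated excerpt.

```python
from sympy import Q, ask, pi, symbols

x, y = symbols("x y")

# A fact that needs no assumptions.
is_pi_rational = ask(Q.rational(pi))

# A query answered under assumptions about the symbols.
is_product_even = ask(Q.even(x * y), Q.even(x) & Q.integer(y))

# When nothing is known, ask() returns None rather than guessing.
is_x_positive = ask(Q.positive(x))

print(is_pi_rational, is_product_even, is_x_positive)
```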