Commit 254b1091
authored 12 years ago by Jaques Grobler, committed by Andreas Mueller 12 years ago

gael`s suggestions/tweaks

parent 6b04635d
Showing 1 changed file:

examples/svm/plot_svm_scale_c.py (+37 −39)
@@ -21,35 +21,35 @@ where
 and our model parameters.
 - :math:`\Omega` is a `penalty` function of our model parameters
 
-If we consider the :math:`\mathcal{L}` function to be the individual error per
+If we consider the loss function to be the individual error per
 sample, then the data-fit term, or the sum of the error for each sample, will
 increase as we add more samples. The penalization term, however, will not
 increase.
 
 When using, for example, :ref:`cross validation <cross_validation>`, to
-set amount of regularization with :math:`C`, there will be a different
+set amount of regularization with `C`, there will be a different
 amount of samples between every problem that we are using for model
 selection, as well as for the final problem that we want to use for
 training.
 
 Since our loss function is dependant on the amount of samples, the latter
-will influence the selected value of :math:`C`.
+will influence the selected value of `C`.
 The question that arises is `How do we optimally adjust C to
 account for the different training samples?`
 
 The figures below are used to illustrate the effect of scaling our
-:math:`C` to compensate for the change in the amount of samples, in the
-case of using an :math:`L1` penalty, as well as the :math:`L2` penalty.
+`C` to compensate for the change in the amount of samples, in the
+case of using an `L1` penalty, as well as the `L2` penalty.
 
 L1-penalty case
 -----------------
-In the :math:`L1` case, theory says that prediction consistency
+In the `L1` case, theory says that prediction consistency
 (i.e. that under given hypothesis, the estimator
 learned predicts as well as an model knowing the true distribution)
-is not possible because of the biasof the :math:`L1`. It does say, however,
+is not possible because of the biasof the `L1`. It does say, however,
 that model consistancy, in terms of finding the right set of non-zero
 parameters as well as their signs, can be achieved by scaling
-:math:`C1`.
+`C1`.
 
 L2-penalty case
 -----------------
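For context, the objective this docstring keeps referring to is defined just above the hunk and is not part of the diff shown here. A hedged reconstruction of that penalized objective, in the docstring's own notation, is sketched below; the exact markup in plot_svm_scale_c.py may differ.

    % Assumed form of the objective discussed above (the actual formula sits
    % above line 21 of the file and is not included in this hunk):
    %   \mathcal{L} -- per-sample loss; the data-fit term sums it over samples
    %   \Omega      -- penalty on the model parameters w
    %   C           -- trade-off between data fit and regularization
    C \sum_{i=1}^{n} \mathcal{L}\bigl(f(x_i), y_i\bigr) + \Omega(w)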
@@ -59,17 +59,21 @@ as the number of samples grow, in order to keep prediction consistency.
 
 Simulations
 ------------
 
-The two figures below plot the values of :math:`C` on the `x-axis` and the
+The two figures below plot the values of `C` on the `x-axis` and the
 corresponding cross-validation scores on the `y-axis`, for several different
 fractions of a generated data-set.
 
-In the :math:`L1` penalty case, the results are best when scaling our :math:`C` with
+In the `L1` penalty case, the results are best when scaling our `C` with
 the amount of samples, `n`, which can be seen in the third plot of the first figure.
 
-For the :math:`L2` penalty case, the best result comes from the case where :math:`C`
+For the `L2` penalty case, the best result comes from the case where `C`
 is not scaled.
 
+.. topic:: Note:
+
+    Two seperate datasets are used for the two different plots. The reason
+    behind this is the `L1` case works better on sparse data, while `L2`
+    is better suited to the non-sparse case.
+
 """
 print __doc__
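To make the scaling idea concrete outside the plots, here is a minimal, hedged sketch (not the example's own code): it cross-validates an L1-penalized linear SVM once over a raw grid of `C` values and once over the same grid multiplied by the number of training samples. It assumes the present-day scikit-learn API (`sklearn.model_selection.GridSearchCV`, `LinearSVC`); the 2012 example instead used `sklearn.grid_search` and pylab, and the dataset and grid below are placeholders.

    # Sketch only: compare an unscaled C grid with one scaled by n_samples.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import LinearSVC

    # Placeholder sparse-ish classification problem (not the example's data).
    X, y = make_classification(n_samples=300, n_features=100,
                               n_informative=5, random_state=0)
    cs = np.logspace(-2.3, -1.3, 10)   # base grid of C values
    n_samples = X.shape[0]

    clf = LinearSVC(penalty='l1', loss='squared_hinge', dual=False, tol=1e-3)

    # Unscaled grid: the same C values regardless of training-set size.
    unscaled = GridSearchCV(clf, param_grid=dict(C=cs), cv=5).fit(X, y)

    # Scaled grid: multiply C by the number of samples, compensating for the
    # data-fit term growing with n while the penalty term does not.
    scaled = GridSearchCV(clf, param_grid=dict(C=cs * n_samples), cv=5).fit(X, y)

    print(unscaled.best_params_, scaled.best_params_)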
@@ -116,9 +120,6 @@ colors = ['b', 'g', 'r', 'c']
 for fignum, (clf, cs, X, y) in enumerate(clf_sets):
     # set up the plot for each regressor
     pl.figure(fignum, figsize=(9, 10))
-    pl.clf()
-    pl.xlabel('C')
-    pl.ylabel('CV Score')
 
     for k, train_size in enumerate(np.linspace(0.3, 0.7, 3)[::-1]):
         param_grid = dict(C=cs)
@@ -136,16 +137,13 @@ for fignum, (clf, cs, X, y) in enumerate(clf_sets):
         for subplotnum, (scaler, name) in enumerate(scales):
             pl.subplot(2, 1, subplotnum + 1)
+            pl.xlabel('C')
+            pl.ylabel('CV Score')
             grid_cs = cs * float(scaler)  # scale the C's
             pl.semilogx(grid_cs, scores, label="fraction %.2f" %
                         train_size)
             pl.title('scaling=%s, penalty=%s, loss=%s' %
                      (name, clf.penalty, clf.loss))
-            #ymin, ymax = pl.ylim()
-            #pl.axvline(grid_cs[np.argmax(scores)], 0, 1,
-            #           color=colors[k])
-            #pl.ylim(ymin=ymin-0.0025, ymax=ymax+0.008)  # adjust the y-axis
 
 pl.legend(loc="best")
 pl.show()
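The last two hunks simply relocate the axis labelling: `xlabel`/`ylabel` now live inside the subplot loop so each panel is labelled individually, and the leftover commented-out ylim/axvline tweaks are dropped. A self-contained sketch of that structure, with placeholder data and the modern `matplotlib.pyplot` import rather than the example's `pylab as pl`, might look like this:

    # Sketch of the plotting layout the commit moves to (placeholder data).
    import numpy as np
    import matplotlib.pyplot as plt

    cs = np.logspace(-2, 1, 10)
    scores = np.random.RandomState(0).rand(10)       # stand-in CV scores
    scales = [(1, 'No scaling'), (100, 'Scaled by n_samples')]

    plt.figure(figsize=(9, 10))
    for subplotnum, (scaler, name) in enumerate(scales):
        plt.subplot(2, 1, subplotnum + 1)
        plt.xlabel('C')                              # per-subplot labels,
        plt.ylabel('CV Score')                       # as in the new version
        grid_cs = cs * float(scaler)                 # scale the C's
        plt.semilogx(grid_cs, scores, label="fraction %.2f" % 0.5)
        plt.title('scaling=%s' % name)

    plt.legend(loc="best")
    plt.show()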