Dear Zhao,
Smearing has several useful features, e.g. it helps with Brillouin zone integration on a finite grid, where a fictitious electronic temperature is used to "smear out" the discontinuous integrand into a smooth function. Usually a larger smearing makes the electronic SCF cycles converge better, but introduces a larger (fictitious) entropic term. See e.g.
Phys. Rev. B 107, 195122, and references therein, for more information about different smearing methods.
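To make the "fictitious temperature" and "entropic term" concrete, here is how they look for Fermi-Dirac smearing, which is only one of the available schemes. The symbols are my own notation for this sketch: \sigma is the smearing width (playing the role of k_B T), \mu the Fermi level, \varepsilon_i the orbital energies and f_i the occupations (spin degeneracy factors left out).

```latex
f_i = \frac{1}{\exp\!\left(\frac{\varepsilon_i - \mu}{\sigma}\right) + 1},
\qquad
-TS = \sigma \sum_i \left[ f_i \ln f_i + (1 - f_i)\ln(1 - f_i) \right],
\qquad
F = E - TS
```

For \sigma \to 0 the occupations f_i become a step function and the entropic term vanishes; a larger \sigma gives smoother occupations but a larger |TS|.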
For a single atom and a single k point, the problem with very small smearing comes down to a limitation of the density mixer. Consider two orbitals that are very close in energy and close to the Fermi level. Let's say one of them is occupied, which changes the potential, which in turn changes the energy levels slightly, so that now the other orbital is below the Fermi level, while the occupied one is pushed above it. Then the occupation would jump to the other orbital, and this might go back and forth ad infinitum.
Smearing ensures that both states (remember, they are very close in energy) are partially occupied, the potential is smoothly updated, and convergence can happen.
So it is not a question of numerics, or not dealing with the fictitious entropic term correctly, but a question of partial occupancies and SCF stability.
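If it helps, the back-and-forth can be reproduced in a toy model. This is only a sketch under my own assumptions, not what any DFT code actually does: two levels share one electron, each level is pushed up by a hypothetical amount U times its own occupation, occupations follow Fermi-Dirac smearing of width sigma, and a plain linear mixer with factor alpha updates the occupations. The names run_scf, e0, U and alpha are made up for this illustration.

```python
import math


def fermi(e, mu, sigma):
    # Fermi-Dirac occupation, written with tanh so that a very small
    # sigma does not overflow an exponential.
    return 0.5 * (1.0 - math.tanh((e - mu) / (2.0 * sigma)))


def run_scf(sigma, e0=(0.0, 0.001), U=0.1, alpha=0.3, max_iter=200, tol=1e-8):
    """Toy SCF loop: two near-degenerate levels, one electron.

    Returns (converged, iterations, occupations).
    All parameters are hypothetical and in arbitrary energy units.
    """
    n = [1.0, 0.0]  # start with the first level fully occupied
    for it in range(1, max_iter + 1):
        # "Potential update": occupying a level raises its energy.
        e = [e0[i] + U * n[i] for i in range(2)]
        # For two levels and one electron, the Fermi level that conserves
        # the electron count is exactly the midpoint of the two energies.
        mu = 0.5 * (e[0] + e[1])
        n_out = [fermi(ei, mu, sigma) for ei in e]
        residual = max(abs(n_out[i] - n[i]) for i in range(2))
        if residual < tol:
            return True, it, n
        # Linear mixing of input and output occupations.
        n = [(1.0 - alpha) * n[i] + alpha * n_out[i] for i in range(2)]
    return False, max_iter, n


for sigma in (1e-5, 0.05):
    converged, iterations, occ = run_scf(sigma)
    print(f"sigma={sigma:g}: converged={converged}, "
          f"iterations={iterations}, occupations={occ}")
```

With the tiny smearing the output occupations keep flipping between (1, 0) and (0, 1), so the mixer never settles; with the larger smearing both levels end up partially occupied and the loop converges.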
Let me know if this leaves you less puzzled,
Michael