NANO/MICROSCALE HEAT TRANSFER

Author:
Zhuomin Zhang



ABOUT THE AUTHOR

Zhuomin M. Zhang has taught at the University of Florida (1995–2002) and the Georgia Institute of Technology (since 2002). Professor Zhang is an ASME Fellow and has done cutting-edge research in the areas of micro/nanoscale heat transfer, with applications to optoelectronic devices and semiconductor manufacturing. He is a recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE), the ASME Heat Transfer Division Best Paper Award, and the AIAA Thermophysics Best Paper Award. Professor Zhang currently serves on the Editorial Board of the International Journal of Thermophysics and is an associate editor for the Journal of Quantitative Spectroscopy & Radiative Transfer and the Journal of Thermophysics and Heat Transfer.

Copyright © 2007 by The McGraw-Hill Companies, Inc.

NANO/MICROSCALE HEAT TRANSFER

Zhuomin M. Zhang
Georgia Institute of Technology
Atlanta, Georgia

New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto

Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Manufactured in the United States of America. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher. 0-07-150973-9 The material in this eBook also appears in the print version of this title: 0-07-143674-X. All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps. McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. For more information, please contact George Hoare, Special Sales, at [email protected] or (212) 904-4069. TERMS OF USE This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms. 
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise. DOI: 10.1036/007143674X

To my wife Lingyun


CONTENTS

Preface
List of Symbols

Chapter 1. Introduction
1.1 Limitations of the Macroscopic Formulation
1.2 The Length Scales
1.3 From Ancient Philosophy to Contemporary Technologies
  1.3.1 Microelectronics and Information Technology
  1.3.2 Lasers, Optoelectronics, and Nanophotonics
  1.3.3 Microfabrication and Nanofabrication
  1.3.4 Probing and Manipulation of Small Structures
  1.3.5 Energy Conversion Devices
  1.3.6 Biomolecule Imaging and Molecular Electronics
1.4 Objectives and Organization of This Book
References

Chapter 2. Overview of Macroscopic Thermal Sciences
2.1 Fundamentals of Thermodynamics
  2.1.1 The First Law of Thermodynamics
  2.1.2 Thermodynamic Equilibrium and the Second Law
  2.1.3 The Third Law of Thermodynamics
2.2 Thermodynamic Functions and Properties
  2.2.1 Thermodynamic Relations
  2.2.2 The Gibbs Phase Rule
  2.2.3 Specific Heats
2.3 Ideal Gas and Ideal Incompressible Models
  2.3.1 The Ideal Gas
  2.3.2 Incompressible Solids and Liquids
2.4 Heat Transfer Basics
  2.4.1 Conduction
  2.4.2 Convection
  2.4.3 Radiation
2.5 Summary
References
Problems

Chapter 3. Elements of Statistical Thermodynamics and Quantum Theory
3.1 Statistical Mechanics of Independent Particles
  3.1.1 Macrostates versus Microstates
  3.1.2 Phase Space
  3.1.3 Quantum Mechanics Considerations
  3.1.4 Equilibrium Distributions for Different Statistics
3.2 Thermodynamic Relations
  3.2.1 Heat and Work
  3.2.2 Entropy
  3.2.3 The Lagrangian Multipliers
  3.2.4 Entropy at Absolute Zero Temperature
  3.2.5 Macroscopic Properties in Terms of the Partition Function
3.3 Ideal Molecular Gases
  3.3.1 Monatomic Ideal Gases
  3.3.2 Maxwell’s Velocity Distribution
  3.3.3 Diatomic and Polyatomic Ideal Gases
3.4 Statistical Ensembles and Fluctuations
3.5 Basic Quantum Mechanics
  3.5.1 The Schrödinger Equation
  3.5.2 A Particle in a Potential Well or a Box
  3.5.3 A Rigid Rotor
  3.5.4 Atomic Emission and the Bohr Radius
  3.5.5 A Harmonic Oscillator
3.6 Emission and Absorption of Photons by Molecules or Atoms
3.7 Energy, Mass, and Momentum in Terms of Relativity
3.8 Summary
References
Problems

Chapter 4. Kinetic Theory and Micro/Nanofluidics
4.1 Kinetic Description of Dilute Gases
  4.1.1 Local Average and Flux
  4.1.2 The Mean Free Path
4.2 Transport Equations and Properties of Ideal Gases
  4.2.1 Shear Force and Viscosity
  4.2.2 Heat Diffusion
  4.2.3 Mass Diffusion
  4.2.4 Intermolecular Forces
4.3 The Boltzmann Transport Equation
  4.3.1 Hydrodynamic Equations
  4.3.2 Fourier’s Law and Thermal Conductivity
4.4 Micro/Nanofluidics and Heat Transfer
  4.4.1 The Knudsen Number and Flow Regimes
  4.4.2 Velocity Slip and Temperature Jump
  4.4.3 Gas Conduction: From the Continuum to the Free Molecule Regime
4.5 Summary
References
Problems

Chapter 5. Thermal Properties of Solids and the Size Effect
5.1 Specific Heat of Solids
  5.1.1 Lattice Vibration in Solids: The Phonon Gas
  5.1.2 The Debye Specific Heat Model
  5.1.3 Free Electron Gas in Metals
5.2 Quantum Size Effect on the Specific Heat
  5.2.1 Periodic Boundary Conditions
  5.2.2 General Expressions of Lattice Specific Heat
  5.2.3 Dimensionality
  5.2.4 Thin Films Including Quantum Wells
  5.2.5 Nanocrystals and Carbon Nanotubes
5.3 Electrical and Thermal Conductivities of Solids
  5.3.1 Electrical Conductivity
  5.3.2 Thermal Conductivity of Metals
  5.3.3 Derivation of Conductivities from the BTE
  5.3.4 Thermal Conductivity of Insulators
5.4 Thermoelectricity
  5.4.1 The Seebeck Effect and Thermoelectric Power
  5.4.2 The Peltier Effect and the Thomson Effect
  5.4.3 Thermoelectric Generation and Refrigeration
  5.4.4 Onsager’s Theorem and Irreversible Thermodynamics
5.5 Classical Size Effect on Conductivities and Quantum Conductance
  5.5.1 Classical Size Effect Based on Geometric Consideration
  5.5.2 Classical Size Effect Based on the BTE
  5.5.3 Quantum Conductance
5.6 Summary
References
Problems

Chapter 6. Electron and Phonon Transport
6.1 The Hall Effect
6.2 General Classifications of Solids
  6.2.1 Electrons in Atoms
  6.2.2 Insulators, Conductors, and Semiconductors
  6.2.3 Atomic Binding in Solids
6.3 Crystal Structures
  6.3.1 The Bravais Lattices
  6.3.2 Primitive Vectors and the Primitive Unit Cell
  6.3.3 Basis Made of Two or More Atoms
6.4 Electronic Band Structures
  6.4.1 Reciprocal Lattices and the First Brillouin Zone
  6.4.2 Bloch’s Theorem
  6.4.3 Band Structures of Metals and Semiconductors
6.5 Phonon Dispersion and Scattering
  6.5.1 The 1-D Diatomic Chain
  6.5.2 Dispersion Relations for Real Crystals
  6.5.3 Phonon Scattering
6.6 Electron Emission and Tunneling
  6.6.1 Photoelectric Effect
  6.6.2 Thermionic Emission
  6.6.3 Field Emission and Electron Tunneling
6.7 Electrical Transport in Semiconductor Devices
  6.7.1 Number Density, Mobility, and the Hall Effect
  6.7.2 Generation and Recombination
  6.7.3 The p-n Junction
  6.7.4 Optoelectronic Applications
6.8 Summary
References
Problems

Chapter 7. Nonequilibrium Energy Transfer in Nanostructures
7.1 Phenomenological Theories
  7.1.1 Hyperbolic Heat Equation
  7.1.2 Dual-Phase-Lag Model
  7.1.3 Two-Temperature Model
7.2 Heat Conduction Across Layered Structures
  7.2.1 Equation of Phonon Radiative Transfer (EPRT)
  7.2.2 Solution of the EPRT
  7.2.3 Thermal Boundary Resistance (TBR)
7.3 Heat Conduction Regimes
7.4 Summary
References
Problems

Chapter 8. Fundamentals of Thermal Radiation
8.1 Electromagnetic Waves
  8.1.1 Maxwell’s Equations
  8.1.2 The Wave Equation
  8.1.3 Polarization
  8.1.4 Energy Flux and Density
  8.1.5 Dielectric Function
  8.1.6 Propagating and Evanescent Waves
8.2 Blackbody Radiation: The Photon Gas
  8.2.1 Planck’s Law
  8.2.2 Radiation Thermometry
  8.2.3 Entropy and Radiation Pressure
  8.2.4 Limitations of Planck’s Law
8.3 Radiative Properties of Semi-Infinite Media
  8.3.1 Reflection and Refraction of a Plane Wave
  8.3.2 Emissivity
  8.3.3 Bidirectional Reflectance
8.4 Dielectric Function Models
  8.4.1 Kramers-Kronig Dispersion Relations
  8.4.2 The Drude Model for Free Carriers
  8.4.3 The Lorentz Oscillator Model for Lattice Absorption
  8.4.4 Semiconductors
  8.4.5 Superconductors
  8.4.6 Metamaterials with a Magnetic Response
8.5 Summary
References
Problems

Chapter 9. Radiative Properties of Nanomaterials
9.1 Radiative Properties of a Single Layer
  9.1.1 The Ray Tracing Method for a Thick Layer
  9.1.2 Thin Films
  9.1.3 Partial Coherence
  9.1.4 Effect of Surface Scattering
9.2 Radiative Properties of Multilayer Structures
  9.2.1 Thin Films with Two or Three Layers
  9.2.2 The Matrix Formulation
  9.2.3 Radiative Properties of Thin Films on a Thick Substrate
  9.2.4 Local Energy Density and Absorption Distribution
9.3 Photonic Crystals
9.4 Periodic Gratings
  9.4.1 Rigorous Coupled-Wave Analysis (RCWA)
  9.4.2 Effective Medium Formulations
9.5 Bidirectional Reflectance Distribution Function (BRDF)
  9.5.1 The Analytical Model
  9.5.2 The Monte Carlo Method
  9.5.3 Surface Characterization
  9.5.4 BRDF Measurements
  9.5.5 Comparison of Modeling with Measurements
9.6 Summary
References
Problems

Chapter 10. Near-Field Energy Transfer
10.1 Total Internal Reflection, Guided Waves, and Photon Tunneling
  10.1.1 The Goos-Hänchen Shift
  10.1.2 Waveguides and Optical Fibers
  10.1.3 Photon Tunneling by Coupled Evanescent Waves
  10.1.4 Thermal Energy Transfer between Closely Spaced Dielectrics
  10.1.5 Resonance Tunneling through Periodic Dielectric Layers
  10.1.6 Photon Tunneling with Negative Index Materials
10.2 Polaritons or Electromagnetic Surface Waves
  10.2.1 Surface Plasmon and Phonon Polaritons
  10.2.2 Coupled Surface Polaritons and Bulk Polaritons
  10.2.3 Polariton-Enhanced Transmission of Layered Structures
  10.2.4 Radiation Transmission through Nanostructures
  10.2.5 Superlens for Perfect Imaging and the Energy Streamlines
10.3 Spectral and Directional Control of Thermal Radiation
  10.3.1 Gratings and Microcavities
  10.3.2 Metamaterials
  10.3.3 Modified Photonic Crystals for Coherent Thermal Emission
10.4 Radiation Heat Transfer at Nanometer Distances
  10.4.1 The Fluctuational Electrodynamics
  10.4.2 Heat Transfer between Parallel Plates
  10.4.3 Asymptotic Formulation
  10.4.4 Nanoscale Radiation Heat Transfer between Doped Silicon
10.5 Summary
References
Problems

Appendix A. Physical Constants, Conversion Factors, and SI Prefixes
Physical Constants
Conversion Factors
SI Prefixes

Appendix B. Mathematical Background
B.1 Some Useful Formulae
  B.1.1 Series and Integrals
  B.1.2 The Error Function
  B.1.3 Stirling’s Formula
B.2 The Method of Lagrange Multipliers
B.3 Permutation and Combination
B.4 Events and Probabilities
B.5 Distribution Functions and the Probability Density Function
B.6 Complex Variables
B.7 The Plane Wave Solution
B.8 The Sommerfeld Expansion

Index


PREFACE

Over the past 20 years, there have been tremendous developments in microelectronics, microfabrication technology, MEMS and NEMS, quantum structures (e.g., superlattices, nanowires, nanotubes, and nanoparticles), optoelectronics and lasers, including ultrafast lasers, and molecular- to atomic-level imaging techniques (such as high-resolution electron microscopy, scanning tunneling microscopy, atomic force microscopy, near-field optical microscopy, and scanning thermal microscopy). The field is fast moving into scaling up and systems engineering to explore the unlimited potential that nanoscience and nanoengineering may offer to restructure the technologies in the new millennium. When the characteristic length becomes comparable to the mechanistic length scale, continuum assumptions that are often made in conventional thermal analyses may break down. Similarly, when the characteristic time becomes comparable to the mechanistic timescale, traditional equilibrium approaches may not be appropriate. Understanding the energy transport mechanisms in small dimensions and short timescales is crucial for future advances of nanotechnology.

In recent years, a growing number of research publications have been in nano/microscale thermophysical engineering. Timely dissemination of the knowledge gained from contemporary research to educate future scientists and engineers is of emerging significance. For this reason, more and more universities have started to offer courses in microscale areas. A self-contained textbook suitable for engineering students is much needed. Many practicing engineers who have graduated earlier wish to learn what is going on in this fascinating area, but are often frustrated due to the lack of a solid background to comprehend the contemporary literature. A book that does not require prior knowledge in statistical mechanics, quantum mechanics, solid state physics, and electrodynamics is extremely helpful. On the other hand, such a book should cover all these subjects in some depth without significant prerequisites.

This book is written for engineering senior undergraduate and graduate students, practicing engineers, and academic researchers who have not been extensively exposed to nanoscale sciences but wish to gain a solid background in the thermal phenomena occurring at small length scales and short timescales. The basic philosophy behind this book is to logically integrate the traditional knowledge in thermal engineering and physics with newly developed theories in an easy-to-understand approach, with ample examples and homework problems. The materials have been used in the graduate course and undergraduate elective that I have taught a number of times at two universities since 1999. While this book can be used as a text for a senior elective or an entry-level graduate course, it is not expected that all the materials will be covered in a one-semester course. The instructors should have the freedom to select materials from the book according to students’ backgrounds and interests. Some chapters and sections can also be used to integrate with traditional thermal science courses in order to update the current undergraduate and graduate curricula with nanotechnology contents.

The content of this book includes microscopic descriptions and approaches, as well as their applications in thermal science and engineering, with an emphasis on energy transport in gases and solids by conduction (diffusion) and radiation (with or without a medium), as well as convection in micro/nanofluidics. Following the introduction in Chapter 1, an in-depth overview of the foundation of macroscopic thermodynamics, heat transfer, and fluid mechanics is given in Chapter 2. Chapter 3 summarizes the well-established theories in statistical mechanics, including classical and quantum statistics; thermal properties of ideal gases are described in the context of statistical thermodynamics, followed by a concise presentation of quantum mechanics. Chapter 4 focuses on microfluidics and introduces the Boltzmann transport equation. The heat transfer and microflow regimes from continuous flow to free molecule flow are described. In Chapters 5, 6, and 7, heat transfer in solid nanostructures is extensively discussed. Chapter 5 presents the classical and quantum size effects on specific heat and thermal conductivity without involving detailed solid state theories, which are introduced in Chapter 6. This arrangement allows a more intuitive learning experience. Chapter 7 focuses on transient as well as nonequilibrium energy transport processes in nanostructures. The next three chapters deal with thermal radiation at nanoscales. Chapter 8 provides the fundamental understanding of electromagnetic waves and the dielectric properties of various materials. The concept of radiation entropy is also introduced, along with the recently demonstrated metamaterials with exotic properties. Chapter 9 describes interference effects of thin films and multilayers, the band structure of photonic crystals, diffraction from surface-relief gratings, and scattering from rough surfaces. Chapter 10 explores the evanescent wave and the coupling phenomena in the near field for energy transfer. Recent advances in nanophotonics and nanoscale radiation heat transfer are also summarized. The dual nature of particles and waves is emphasized throughout the book in explaining the energy carriers, such as molecules in ideal gases, electrons in metals, phonons in dielectric crystalline materials, and photons for radiative transfer.
In the early 1990s, I was fortunate to work with Professor Markus Flik for my Ph.D. dissertation on the infrared spectroscopy of thin (down to 10 nanometers) high-temperature superconducting films for microfabricated, highly sensitive radiation detectors, as well as to assist him in the summer short course on microscale heat transfer at the Massachusetts Institute of Technology. While I was still a postdoctoral researcher, the late Professor Chang-Lin Tien, then Chancellor of the University of California at Berkeley, wrote an invitation letter to me to give a seminar in the Department of Mechanical Engineering of Berkeley in January 1994; he continuously supported me, including the development of the concept of this book. The last time I heard from him was just a few weeks prior to the 2000 National Heat Transfer Conference in Pittsburgh, where he delivered a plenary speech before he fell ill. In his letter dated August 10, 2000, Professor Tien enthusiastically endorsed my plan to write a microscale textbook and encouraged me to include nano aspects. He wrote, “I would like to express to you my strongest support for your project; however, I would suggest that you broaden the content somewhat beginning with the title to ‘Micro/Nanoscale Heat ...,’ and to talk about some coverage on nano aspects.” Professor Tien opened my eyes, and it took me several years afterward to complete this book, which now has more emphasis on nanoscale thermal sciences and engineering.

I also benefited greatly from the encouragement and comments received through discussions with a large number of people in the heat transfer and thermophysics community, too many to be listed here. I am grateful to my colleagues and friends at both the University of Florida (UF) and Georgia Tech for their help whenever needed.
I especially want to thank Professor William Tiederman, who was Chair of the Department of Mechanical Engineering during my stay at UF, for his support and mentorship at the early stage of my independent research and teaching career. Professor David Tanner in the Department of Physics of UF helped me understand solid state physics; I have enjoyed collaboration with him since 1995. Through the years, Dr. Jack Hsia, former Chief of Academic Affairs at the National Institute of Standards and Technology (NIST), offered me much personal and professional advice. He is one of the many outstanding mentors I have had from NIST, where I gained my postdoctoral experience and worked for a number of summers afterward.

This book would not have been possible without my graduate students’ hard work and dedication. Most of them have taken my classes and proofread different versions of the manuscript. Some materials in the last few chapters of the book were generated based on their thesis research. Many graduate and undergraduate students who have taken my classes or worked in the Nanoscale Thermal Radiation Lab also provided constructive suggestions. I enjoyed working with all of them.

I must thank the Sponsoring Editor, Ken McCombs, for his endurance and persistence that kept me on the writing track over the past few years, and the whole production team, for carefully editing the manuscript and setting the final pages. While this project was partially supported by the National Science Foundation as part of my educational plan in the CAREER/PECASE grant, I take full responsibility for any inadvertent errors or mistakes.

Finally, I thank my family for their understanding and support throughout the writing journey. My three children, Emmy, Angie, and Bryan, have given me great happiness and made my life meaningful. This book is dedicated to my wife Lingyun for the unconditional love and selfless care she has provided to me and to our children.

ZHUOMIN M. ZHANG


LIST OF SYMBOLS

A            area, m²; Helmholtz free energy, J
A_c          cross-sectional area, m²
A_λ          directional-spectral absorptance of a semitransparent material
a            acceleration, m/s²
a            lattice constant, m; magnitude of acceleration, m/s²
a_0          Bohr radius, 0.0529 nm
a_λ          absorption coefficient, m⁻¹
B            magnetic induction or magnetic flux density, T (tesla) or Wb/m²
C            volumetric heat capacity (ρc_p), J/(m³ K)
c            phase velocity of electromagnetic wave, m/s
c_0          speed of light in vacuum, 2.998 × 10⁸ m/s
c_v or c̄_v   mass or molar specific heat for constant volume, J/(kg K) or J/(kmol K)
c_p or c̄_p   mass or molar specific heat for constant pressure, J/(kg K) or J/(kmol K)
D            dynamical matrix; electric displacement, C/m²
D            density of states, m⁻³; diameter, m
D_AB         binary diffusion coefficient, m²/s
d            diameter or film thickness, m
E            electric field vector, N/C or V/m
E            energy, J; magnitude of electric field, V/m
E_F          Fermi energy, J
E_g          bandgap energy, J
e            electron charge (absolute value), 1.602 × 10⁻¹⁹ C
e_b          blackbody emissive power, W/m²
F            force (vector or magnitude), N
F            normalized distribution function
f            distribution function (sometimes normalized)
G            reciprocal lattice vector, m⁻¹; dyadic Green function
G            Gibbs free energy, J; electron-phonon coupling constant, W/(m³ K)
ḡ            molar specific Gibbs free energy, J/kmol
g            degeneracy
H            magnetic field vector, A/m or C/(m s)
H            enthalpy, J; magnetic field strength, A/m or C/(m s)
h            mass specific enthalpy, J/kg; convection heat transfer coefficient, W/(m² K); Planck’s constant, 6.626 × 10⁻³⁴ J s
h_m          convection mass transfer coefficient, m/s
ħ            Planck’s constant divided by 2π, h/2π
h̄            molar specific enthalpy, J/kmol
I            unit matrix; unit dyadic
I            moment of inertia, kg m²; intensity or radiance, W/(m² μm sr); electric current, A
i            √−1
i, j, k      indices used in series
J            flux vector or magnitude (quantity transferred per unit area per unit time)
J, J_e       current density (also called electric charge flux), A/m²
J_E          energy flux, W/m²
J_m          mass flux, kg/(s m²)
J_N          particle flux, m⁻² s⁻¹
J_p          momentum flux, Pa (N/m²)
K            spring constant, N/m; Thomson’s coefficient, V/K; Bloch wavevector, m⁻¹
k            wavevector, m⁻¹
k            magnitude of the wavevector, m⁻¹
k_B          Boltzmann’s constant, 1.381 × 10⁻²³ J/K
L            characteristic length, m
L_0          average distance between molecules or atoms, m
L_λ          radiation entropy intensity, W/(K m² μm sr)
l            length, m
l, m, n      index numbers
M            molecular weight, kg/kmol
m            mass of a system or a single particle, kg
m_r          reduced mass, kg
m*           effective mass, kg
ṁ            mass flow rate or mass transfer rate, kg/s
N            number of particles; number of phonon oscillators
N_A          Avogadro’s constant, 6.022 × 10²⁶ kmol⁻¹; acceptor concentration, m⁻³
N_D          donor concentration, m⁻³
Ṅ            particle flow rate, s⁻¹
n            number density, m⁻³; quantum number; real part of refractive index or refractive index
n̄            amount of substance, kmol
ñ            complex refractive index
P            propagation matrix; polarization vector or dipole moment per unit volume, C/m²
P            pressure, Pa (N/m²)
P_ij         momentum flux component, Pa
p            momentum vector (mv or ħk), kg m/s
p            momentum (mv or ħk), kg m/s; probability; specularity
p, q         index numbers
Q            heat, J; quality factor
Q̇            heat transfer rate, W
q            number of coexisting phases; number of atoms per molecule
q̇            thermal energy generation rate, W/m³
q″           heat flux vector, W/m²
q″           heat flux, W/m²
R            gas constant, J/(kg K); electrical resistance, Ω or V/A
R′           directional-hemispherical reflectance
R″_b         thermal boundary resistance, m² K/W
R″_t         thermal resistance, m² K/W
R̄            universal gas constant, 8314.5 J/(kmol K)
r            distance or radius, m; Fresnel reflection coefficient
r_e          electrical resistivity, Ω m
r̃            complex Fresnel reflection coefficient
S            Poynting vector, W/m²
S            entropy, J/K
S_j          strength of the jth phonon oscillator
Ṡ            entropy transfer rate, W/K
Ṡ_gen        entropy generation rate, W/K
s or s̄       specific entropy, J/(kg K), J/(m³ K), or J/(kmol K)
ṡ_gen        volumetric entropy generation rate, W/(m³ K)
s″           entropy flux, J/(m² K)
T            temperature, K
T′           directional-hemispherical transmittance
t            time, s; Fresnel transmission coefficient
t̃            complex Fresnel transmission coefficient
U            internal energy, J; periodic potential, J
u_d          drift velocity, m/s
u or ū       specific internal energy: mass specific, J/kg, and volume specific (i.e., energy density), J/m³, or molar specific, J/kmol
V            volume, m³; voltage, V
v            velocity, m/s
v_B          bulk or mean velocity, m/s
v_R          random or thermal velocity, m/s
v            specific volume, m³/kg; speed, m/s
v_a          speed of sound or average speed of phonons, m/s
v_F          Fermi velocity, m/s
v_g          magnitude of group velocity (dω/dk), m/s
v_l, v_t     longitudinal, transverse phonon speed, m/s
v_p          phase speed (ω/k), m/s
v_x, v_y, v_z  velocity components, m/s
v̄            molar specific volume, m³/kmol; average speed, m/s
W            work, J; width, m
x, y, z      coordinates, m
Z            partition function

DIMENSIONLESS PARAMETERS

Kn           Knudsen number, Λ/L
Le           Lewis number, D_AB/α = Pr/Sc
Lz           Lorentz number, κ/(σT)
Ma           Mach number, v/v_a
Nu           Nusselt number, hL/κ
Pe           Peclet number, RePr = v_∞L/α
Pr           Prandtl number, ν/α
Re           Reynolds number, ρv_∞L/μ
Sc           Schmidt number, ν/D_AB
ZT           dimensionless figure of merit for thermoelectricity
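The dimensionless groups listed above follow directly from their definitions, so they are easy to evaluate numerically. The short Python sketch below is only an illustration: the `Fluid` class, the function names, and the rough property values for air near room temperature are assumptions introduced here and are not taken from the book. It uses the standard relations ν = μ/ρ and α = κ/(ρc_p) and shows that the Peclet number equals the product RePr.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fluid:
    """Bulk fluid properties in SI units (hypothetical container for this sketch)."""
    rho: float             # density, kg/m^3
    mu: float              # viscosity, N s/m^2
    kappa: float           # thermal conductivity, W/(m K)
    cp: float              # specific heat at constant pressure, J/(kg K)
    mean_free_path: float  # mean free path (Lambda), m

    @property
    def nu(self) -> float:
        """Kinematic viscosity, nu = mu / rho, m^2/s."""
        return self.mu / self.rho

    @property
    def alpha(self) -> float:
        """Thermal diffusivity, alpha = kappa / (rho * cp), m^2/s."""
        return self.kappa / (self.rho * self.cp)


def knudsen(fluid: Fluid, length: float) -> float:
    """Kn = Lambda / L."""
    return fluid.mean_free_path / length


def prandtl(fluid: Fluid) -> float:
    """Pr = nu / alpha."""
    return fluid.nu / fluid.alpha


def reynolds(fluid: Fluid, speed: float, length: float) -> float:
    """Re = rho * v * L / mu."""
    return fluid.rho * speed * length / fluid.mu


def peclet(fluid: Fluid, speed: float, length: float) -> float:
    """Pe = v * L / alpha, which equals Re * Pr."""
    return speed * length / fluid.alpha


if __name__ == "__main__":
    # Rough, illustrative values for air near room temperature (assumed, not from the book).
    air = Fluid(rho=1.16, mu=1.85e-5, kappa=0.026, cp=1007.0, mean_free_path=68e-9)
    for length in (1e-3, 1e-6, 1e-7):
        print(f"L = {length:.0e} m  Kn = {knudsen(air, length):.3g}")
    print(f"Pr = {prandtl(air):.3g}")
    print(f"Re = {reynolds(air, 2.0, 1e-3):.3g}, Pe = {peclet(air, 2.0, 1e-3):.3g}")
```

Shrinking the characteristic length L while the mean free path stays fixed raises Kn in inverse proportion, which is the quantitative form of the statement that continuum assumptions may break down at small length scales.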


GREEK SYMBOLS

α            thermal diffusivity, m²/s; other constant
α and β      Lagrangian multipliers
α_T          thermal accommodation coefficient
α_v          (tangential) momentum accommodation coefficient
α_v′         normal momentum accommodation coefficient
α′_λ         directional-spectral absorptivity
β            phase shift, rad; various coefficients
β_P          isobaric thermal expansion coefficient, K⁻¹
β_T          2γ(2 − α_T)Kn/[α_T(γ + 1)Pr]
β_v          (2 − α_v)Kn/α_v
Γ_ij         hemispherical transmissivity for phonons from medium i to j
Γ_S          Seebeck’s coefficient, V/K
γ            specific heat ratio (c_p/c_v); scattering rate (1/τ), rad/s
γ_s          Sommerfeld constant, J/(kg K²)
δ            differential small quantity; boundary layer thickness, m
δ_λ          radiation penetration depth, m
ε            particle energy, J; electric permittivity, C²/(N m²); ratio of permittivity to that of vacuum; emissivity
ε̃            complex dielectric function, i.e., ratio of permittivity to that of vacuum
ε′_λ         directional-spectral emissivity
η_H          Hall coefficient E_y/(J_x B), m³/C
Θ            characteristic temperature, K
Θ_D          Debye temperature, K
θ            zenith angle, rad
θ_B          Brewster’s angle, rad
θ_c          critical angle, rad
κ            thermal conductivity, W/(m K); extinction coefficient (i.e., imaginary part of the refractive index)
κ_T          isothermal compressibility, Pa⁻¹
Λ            mean free path, m; period of a grating or photonic crystal, m
Λ_a          average distance between collisions, m
λ            wavelength, m
μ            viscosity, N s/m²; chemical potential, J; electron or hole mobility, m²/(V s); magnetic permeability, N/A²; ratio of the permeability to that of vacuum
μ_F          Fermi energy, J
ν            kinematic viscosity, m²/s; frequency, Hz
ν̄            wavenumber, cm⁻¹
Π            Peltier’s coefficient, V
ρ            density, kg/m³
ρ_e          charge density, C/m³
ρ′           directional-hemispherical reflectivity
σ            electrical conductivity, (Ω m)⁻¹; standard deviation
σ_rms        root-mean-square surface roughness, m
σ_SB         Stefan-Boltzmann constant, 5.67 × 10⁻⁸ W/(m² K⁴)
σ′_SB        phonon Stefan-Boltzmann constant, W/(m² K⁴)
τ            relaxation time, s; shear stress, Pa
τ′           directional-hemispherical transmissivity
τ_12         transmission coefficient
Φ            scattering phase function; viscous dissipation function; potential function
φ            number of degrees of freedom; azimuthal angle, rad; intermolecular potential
Ψ            Schrödinger’s wavefunction; various functions
ψ            molecular quantity; wavefunction; work function, J
Ω            solid angle, sr; thermodynamic probability
ω            angular frequency, rad/s
ω_p          plasma frequency, rad/s
ϖ            velocity space, dϖ = dv_x dv_y dv_z

SUBSCRIPTS

0                   vacuum
1, 2, 3             medium 1, 2, 3
b                   blackbody; boundary
d                   defect or impurity
e                   electron
h                   hole
i                   incident
i, j, k, l, m, n    indices
m                   bulk or mean; maximum; medium
mp                  most probable
n or p              n-type or p-type semiconductor
p                   TM wave or p (parallel) polarization
r                   reflected; rotational
s                   TE wave or s (perpendicular) polarization; scattered; surface; solid; lattice
t                   transmitted; translational
th                  thermal
v                   vibrational
w                   wall
∞                   free stream
λ, ν, or ω          spectral property in terms of wavelength, frequency, or angular frequency


CHAPTER 1

INTRODUCTION

Improvement in performance and shrinkage of device sizes in microelectronics have been major driving forces for scientific and economic progress over the past 30 years. Developments in semiconductor processing and surface sciences have allowed precise control over critical dimensions with desirable properties for solid state devices. In the past 20 years, there have been tremendous developments in micro- and nanoelectromechanical systems (MEMS and NEMS), microfluidics and nanofluidics, quantum structures and devices, photonics and optoelectronics, nanomaterials for molecular sensing and biomedical diagnosis, and scanning probe microscopy for measurement and manipulation at the molecular and atomic levels. Nanotechnology research has not only emerged as a new area of science and engineering, but it has also become an integral part of almost all natural science and engineering disciplines. According to the Web site of Georgia Institute of Technology (www.gatech.edu), more than 10% of the faculty members at the university have been involved with research projects related to nanoscience and nanoengineering. The same can be said for most major research universities in the United States and in many other countries. Furthermore, the study of nanoscience and nanoengineering requires and has resulted in close interactions across the boundaries of many traditional disciplines. Knowledge of physical behavior at the molecular and atomic levels has played and will continue to play an important role in our understanding of the fundamental processes occurring in the macro world. This will enable us to design and develop novel devices and machines, ranging from a few nanometers all the way to the size of automobiles and airplanes. We have already enjoyed camera phones and the iPod that can store thousands of pictures and songs. 
In the next few decades, the advancement of nano/microscale science and engineering will fundamentally restructure the technologies currently used in manufacturing, energy production and utilization, communication, transportation, space exploration, and medicine.1,2 A key issue associated with miniaturization is the tremendous increase in the heat dissipation per unit volume. Micro/nanostructures may enable engineered materials with unique thermal properties to allow significant enhancement or reduction of the heat flow rate. Therefore, knowledge of thermal transport from the micrometer scale down to the nanometer scale and thermal properties of micro/nanostructures is of critical importance to future technological growth. Solutions to more and more problems in small devices and systems require a solid understanding of the heat (or more generally, energy) transfer mechanisms in reduced dimensions and/or short time scales, because classical equilibrium and continuum assumptions are not valid anymore. Examples are the thermal analysis and modeling of micro/nanodevices, ultrafast laser interaction with materials, micromachined thermal sensors and actuators, thermoelectricity in nanostructures, photonic crystals, microscale thermophotovoltaic devices, and so forth.3,4


1.1 LIMITATIONS OF THE MACROSCOPIC FORMULATION

As an ancient Chinese philosopher put it, suppose you take a foot-long wood stick and cut off half of it each day; you will never reach an end even after thousands of years, as shown in Fig. 1.1. Modern science has taught us that, at some stage, one would reach the molecular level and even the atomic level, below which the physical and chemical properties are completely different from those of the original material. The wooden stick or slice would eventually become something else that is not distinguishable from the other constituents in the atmosphere. Basically, properties of materials at very small scales may be quite different from those of the corresponding bulk materials.

FIGURE 1.1 The length of the wood stick: $l_1 = 1$ ft in day 1, $l_2 = 1/2$ ft in day 2, and $l_n = 1/2^{n-1}$ ft in day n.

Note that 1 nm (nanometer) is one-billionth of a meter. The diameter of a hydrogen atom H is on the order of 0.1 nm, and that of a hydrogen molecule H₂ is approximately 0.3 nm. Using the formula $l_n = 0.3048/2^{n-1}$ m, where n is number of days, we find $l_{30} = 5.7 \times 10^{-10}$ m (or 0.57 nm) after just a month, which is already near the diameter of a hydrogen atom. While atoms can still be divided with large and sophisticated facilities, our ability to observe, manipulate, and utilize them is very limited. On the other hand, most biological processes occur at the molecular level. Many novel physical phenomena happen at the length scale of a few nanometers and can be integrated into large systems. This is why the nanometer is a critical length scale for the realization of practically important new materials, structures, and phenomena. For example, carbon nanotubes with diameters ranging from 0.4 to 50 nm or so have dramatically different properties. Some researchers have shown that these nanotubes hold promise as the building block of nanoelectronics. Others have found that the thermal conductivity of single-walled carbon nanotubes at room temperature could be an order of magnitude higher than that of copper. Therefore, carbon nanotubes have been considered as a candidate material for applications that require a high heat flux.

In conventional fluid mechanics and heat transfer, we treat the medium as a continuum, i.e., indefinitely divisible without changing its physical nature. All the intensive


properties can be defined locally and continuously. For example, the local density is defined as

$$\rho = \lim_{\delta V \to 0} \frac{\delta m}{\delta V} \tag{1.1}$$

where $\delta m$ is the mass enclosed within a volume element $\delta V$. When the characteristic dimension is comparable with or smaller than that of the mechanistic length (for example, the molecular mean free path, which is the average distance that a molecule travels between two collisions), the continuum assumption will break down. The density defined in Eq. (1.1) will depend on the size of the volume, $\delta V$, and will fluctuate with time even at macroscopic equilibrium. Noting that the mean free path of air at standard atmospheric conditions is about 70 nm, the continuum assumption is well justified for many engineering applications until the submicrometer regime or the nanoscale is reached. Nevertheless, if the pressure is very low, as in an evacuated chamber or at a high elevation, the mean free path can be very large; and thus, the continuum assumption may break down even at relatively large length scales.

Within the macroscopic framework, we calculate the temperature distribution in a fluid or solid by assuming that the medium under consideration is not only a continuum but also at thermodynamic equilibrium everywhere. The latter condition is called the local-equilibrium assumption, which is required because temperature can be defined only for stable-equilibrium states. With extremely high temperature gradients at sufficiently small length scales and/or during very short periods of time, the assumption of local equilibrium may be inappropriate. An example is the interaction between short laser pulses and a material. Depending on the type of laser, the pulse duration or width can vary from a few tens of nanoseconds down to several femtoseconds (1 fs = $10^{-15}$ s). In the case of ultrafast laser interaction with metals, free electrons in the metal could gain energy quickly to arrive at an excited state corresponding to an effective temperature of several thousand kelvins, whereas the crystalline lattices remain near room temperature.
After an elapse of time represented by the electron relaxation time, the excess energy of electrons will be transferred to phonons, which are energy quanta of lattice vibration, thereby causing a heating effect that raises the temperature or changes the phase of the material under irradiation.

Additional mechanisms may affect the behavior of a system as the physical dimensions shrink or as the excitation and detection times are reduced. A scale-down of the theories developed from macroscopic observations often proves to be unsuitable for applications involving micro/nanoscale phenomena. Examples are reductions in the conductivity of thin films or thin wires due to boundary scattering (size effect), discontinuous velocity and temperature boundary conditions in microfluidics, wave interferences in thin films, and tunneling of electrons and photons through narrow gaps. In the quantum limit, the thermal conductance of a nanowire will reach a limiting value that is independent of the material that the nanowire is made of. At the nanoscale, the radiation heat transfer between two surfaces can exceed that calculated from the Stefan-Boltzmann law by several orders of magnitude. Another effect of miniaturization is that surface forces (such as shear forces) will scale down with $L^2$, where L is the characteristic length, while volume forces (such as buoyancy) will scale down with $L^3$. This will make surface forces predominant over volume forces at the microscale.
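The order-of-magnitude arguments of this section can be checked numerically. The sketch below is only illustrative: it evaluates the stick length $l_n$, a hard-sphere kinetic-theory estimate of the mean free path of air, the ratio of mean free path to characteristic length (the Knudsen number, Kn), and the 1/L growth of the surface-to-volume force ratio. The hard-sphere formula, the effective molecular diameter of 0.37 nm, and all function names are assumptions introduced here, not values taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stick_length_m(day: int) -> float:
    """Length of the halved stick on day n: l_n = 0.3048 / 2**(n - 1) meters."""
    if day < 1:
        raise ValueError("day must be 1 or larger")
    return 0.3048 / 2 ** (day - 1)


def mean_free_path_m(temperature_k: float, pressure_pa: float, diameter_m: float) -> float:
    """Hard-sphere kinetic-theory estimate: k_B T / (sqrt(2) * pi * d**2 * P)."""
    return K_B * temperature_k / (math.sqrt(2.0) * math.pi * diameter_m**2 * pressure_pa)


def knudsen_number(mean_free_path: float, characteristic_length: float) -> float:
    """Ratio of the mean free path to the characteristic length."""
    return mean_free_path / characteristic_length


def surface_to_volume_force_ratio(length_m: float, reference_length_m: float = 1.0) -> float:
    """Surface forces scale as L**2 and volume forces as L**3, so their ratio grows as 1/L."""
    return reference_length_m / length_m


# Assumed effective molecular diameter of air (illustrative value, not from the text).
AIR_DIAMETER_M = 3.7e-10

mfp_air = mean_free_path_m(298.15, 101325.0, AIR_DIAMETER_M)
print(f"l_30 = {stick_length_m(30):.2e} m")
print(f"mean free path of air = {mfp_air * 1e9:.0f} nm")
for length in (1e-3, 1e-6, 100e-9):
    print(
        f"L = {length:.0e} m: Kn = {knudsen_number(mfp_air, length):.2e}, "
        f"surface/volume force ratio relative to 1 m = {surface_to_volume_force_ratio(length):.0e}"
    )
```

With these assumptions the estimated mean free path comes out close to the 70 nm quoted above, and Kn approaches unity only when the characteristic length reaches the submicrometer range, which is where the continuum assumption is expected to fail.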

1.2 THE LENGTH SCALES

It is instructive to compare the length scales of different phenomena and structures, especially against the wavelength of the electromagnetic spectrum. Figure 1.2 compares the wavelength ranges with some characteristic dimensions. One can see that MEMS generally produce micromachining capabilities from several millimeters down to a few micrometers.


Currently, the smallest feature of integrated circuits is well below 100 nm. The layer thickness of thin films ranges from a few nanometers up to several micrometers. The wavelengths of the visible light are in the range from approximately 380 to 760 nm. On the other hand, thermal radiation covers a part of the ultraviolet, the entire visible and infrared, and a portion of the microwave region. The thickness of human hair is between 50 and 100 µm, while the diameter of red blood cells is about 6 to 8 µm. A typical optical microscope can magnify 100 times with a resolution of 200 to 300 nm, which is about half the wavelength and is limited due to the diffraction of light. Therefore, optical microscopy is commonly used to study micrometer-sized objects. On the other hand, atoms and molecules are on the order of 1 nm, which falls in the x-ray and electron-beam wavelength region. Therefore, x-ray and electron microscopes are typically used for determining crystal structures and defects, as well as for imaging nanostructures. The development of scanning probe microscopes (SPMs) and near-field scanning optical microscopes (NSOMs) in the 1980s enabled unprecedented capabilities for the visualization and manipulation of nanostructures, such as nanowires, nanotubes, nanocrystals, single molecules, individual atoms, and so forth, as will be discussed in Sec. 1.3.4. Figure 1.2 also shows that the mean free path of heat carriers (e.g., molecules in gases, electrons in metals, and phonons or lattice vibration in dielectric solids) often falls in the micrometer to nanometer scales, depending on the material, temperature, and type of carrier.

FIGURE 1.2 Characteristic length scales as compared with the wavelength of electromagnetic spectrum. [Chart. Wavelength ranges, from long to short: radio waves, microwave, far-IR, mid-IR, near-IR, visible, ultraviolet, deep UV, x-ray, e-beam, with thermal radiation marked as a band. Length axis: 10 cm down to 0.1 nm. Characteristic dimensions, from large to small: semiconductor wafers; human hands and fingers; microelectronic chips; thickness of wafers (substrates); sand grains; thickness of human hair; MEMS; pollen grains, blood cells; thickness of films or coatings; surface roughness, mean free paths; smallest feature of integrated circuits; nanoparticles, nanorods, nanowires; nanotubes, SPM tips; molecules; lattice constants; atoms.]

This book is motivated by the need to understand the thermal phenomena and heat transfer processes in micro/nanosystems and at very short time scales for solving problems occurring in contemporary and future technologies. A brief historical retrospective is given in the next section on the development of modern science and technologies, with a focus


on the recent technological advances leading to nanotechnology. The role of thermal engineering throughout this technological advancement is outlined.
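The statement in this section that an optical microscope resolves about half a wavelength can be illustrated with the Abbe estimate, d ≈ λ/(2 NA). The sketch below is a rough check only: the Abbe formula, the numerical aperture NA (taken as 1 by default) and the function name are assumptions introduced here, not quantities given in the text.

```python
def diffraction_limit_nm(wavelength_nm: float, numerical_aperture: float = 1.0) -> float:
    """Abbe estimate of the smallest resolvable feature: wavelength / (2 * NA)."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and numerical aperture must be positive")
    return wavelength_nm / (2.0 * numerical_aperture)


VISIBLE_RANGE_NM = (380.0, 760.0)  # visible band quoted in the text

for wavelength in VISIBLE_RANGE_NM:
    limit = diffraction_limit_nm(wavelength)
    print(f"{wavelength:.0f} nm light -> about {limit:.0f} nm resolution (NA = 1 assumed)")
```

For visible light this gives a few hundred nanometers, the same order as the 200 to 300 nm quoted for a typical optical microscope, which is why features near 1 nm call for x-ray or electron-beam methods.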

1.3 FROM ANCIENT PHILOSOPHY TO CONTEMPORARY TECHNOLOGIES

Understanding the fundamentals of the composition of all things in the universe, their movement in space and with time, and the interactions between one and another is a human curiosity and the inner drive that makes us different from other living beings on the earth. The ancient Chinese believed that everything was composed of the five elements: metal, wood, water, fire, and earth (or soil) that generate and overcome one another in certain order and time sequence. These simple beliefs were not merely used for fortune-telling but have helped the development of traditional Chinese medicine, music, military strategy, astronomy, and calendar. In ancient Greece, the four elements (fire, earth, air, and water) were considered as the realm wherein all things existed and whereof all things consisted. These classical element theories prevailed in several other countries in somewhat different versions for over 2000 years, until the establishment of modern atomic theory that began with John Dalton’s experiment on gases some 200 years ago. In 1811, Italian chemist Amedeo Avogadro introduced the concept of the molecule, which consists of stable systems or bound state of atoms. A molecule is the smallest particle that retains the chemical properties and composition of a pure substance. The first periodic table was developed by Russian chemist Dmitri Mendeleev in 1869. Although the original meaning of atom in Greek is “indivisible,” subatomic particles have since been discovered. For example, electrons as a subatomic particle were discovered in 1897 by J. J. Thomson, who won the 1906 Nobel Prize in Physics. An atom is known as the smallest unit of one of the 116 confirmed elements so far.

The first industrial revolution began in the late eighteenth century and boosted the economy of western countries from manual labor to the machine age by the introduction of machine tools and textile manufacturing. Following the invention of the steam engine in the mid-nineteenth century, the second industrial revolution had an even bigger impact on human life through the development of steam-powered ships and trains, along with the internal combustion engines, and the generation of electrical power. Newtonian mechanics and classical thermodynamics have played an indispensable role in the industrial revolutions. The development of machinery and the understanding of the composition of matter have allowed unprecedented precision of experimental investigation of physical phenomena, leading to the establishment of modern physics in the early twentieth century.

The nature of light has long been debated. At the turn of the eighteenth century, Isaac Newton formulated the corpuscular theory of light and observed with his prism experiment that sunlight is composed of different colors. In the early nineteenth century, the discovery of infrared and ultraviolet radiation and Young’s double-slit experiment confirmed Huygens’ wave theory, which was overshadowed by Newton’s corpuscular theory for some 100 years. With the establishment of Maxwell’s equations that fully describe the electromagnetic waves and Michelson’s interferometric experiment, the wave theory of radiation had been largely accepted by the end of the nineteenth century. While the wave theory was able to explain most of the observed phenomena, it could not explain thermal emission over a wide spectrum, nor was it able to explain the photoelectric effect. Max Planck in 1901 used the hypothesis of light, or radiation quanta, or oscillators, to successfully derive the blackbody spectral distribution function. In 1905, Albert Einstein explained the photoelectric effect based on the concept of radiation quanta. To knock out an electron from the metal surface, the energy of each incoming radiation quantum (hν) must be sufficiently large because one electron can absorb only one quantum.
This explained why photoemission could not occur


at frequencies below the threshold value, no matter how intense the incoming radiation might be. It appears that light is not indefinitely divisible but must exist in multiples of the smallest massless quanta, which are known as photons. In 1924, Louis de Broglie hypothesized that particles should also exhibit wavelike characteristics. With the electron diffraction experiment, it was found that electrons indeed can behave like waves with a wavelength inversely proportional to the momentum. Electron microscopy was based on the principle of electron diffraction. The wave-particle duality was essential to the establishment of quantum mechanics in the early twentieth century. Quantum mechanics describes the phenomena occurring in minute particles, structures, and their interaction with radiation, for which classical mechanics and electrodynamics are not applicable. The fundamental scientific understanding gained during the first half of the twentieth century has facilitated the development of contemporary technologies that have transformed from the industrial economy to the knowledge-based economy and from the machine age to the information age. The major technological advancements in the last half of the century are highlighted in the following sections.
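Two quantum relations invoked in this section can be made concrete with a short calculation: the photon energy hν that must exceed a threshold for photoemission, and the de Broglie wavelength λ = h/p of an electron. The sketch below is illustrative only. The work function of 2.3 eV is a hypothetical example value, the electron is treated nonrelativistically after acceleration through a given voltage, and the function names are introduced here for illustration.

```python
import math

H = 6.62607015e-34             # Planck constant, J s
C = 2.99792458e8               # speed of light in vacuum, m/s
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron rest mass, kg


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of one radiation quantum, h*nu = h*c/lambda, in electronvolts."""
    return H * C / (wavelength_nm * 1e-9) / E_CHARGE


def can_photoemit(wavelength_nm: float, work_function_ev: float) -> bool:
    """One electron absorbs one quantum, so emission needs h*nu >= work function."""
    return photon_energy_ev(wavelength_nm) >= work_function_ev


def threshold_wavelength_nm(work_function_ev: float) -> float:
    """Longest wavelength that can still eject an electron."""
    return H * C / (work_function_ev * E_CHARGE) * 1e9


def electron_de_broglie_wavelength_m(voltage_v: float) -> float:
    """lambda = h / p with p = sqrt(2 m e V), a nonrelativistic approximation."""
    momentum = math.sqrt(2.0 * M_ELECTRON * E_CHARGE * voltage_v)
    return H / momentum


WORK_FUNCTION_EV = 2.3  # hypothetical metal, chosen only for illustration

print(f"threshold wavelength: {threshold_wavelength_nm(WORK_FUNCTION_EV):.0f} nm")
for wavelength in (400.0, 700.0):
    print(f"{wavelength:.0f} nm photon ({photon_energy_ev(wavelength):.2f} eV): "
          f"photoemission possible = {can_photoemit(wavelength, WORK_FUNCTION_EV)}")
print(f"100 V electron wavelength: {electron_de_broglie_wavelength_m(100.0) * 1e9:.3f} nm")
```

Raising the intensity at 700 nm changes nothing in this picture, because each quantum still carries less energy than the assumed threshold. The sub-nanometer electron wavelength is what makes electron diffraction and electron microscopy sensitive to atomic spacings.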

1.3.1 Microelectronics and Information Technology

In his master’s thesis at MIT published in 1940, Claude Shannon (1916–2001) used Boolean algebra and showed how to use TRUE and FALSE to represent the function of switches in electronic circuits. Digital computers were invented during the 1940s in several countries, including the IBM Mark I, which was 2.4 m high and 16 m long. In 1948, while working at Bell Labs, Shannon published an article, “A Mathematical Theory of Communication,” which marked the beginning of modern communication and information technology.5 In that paper, he laid out the basic principles underlying the communication of information with two symbols, 1 and 0, and coined the term “bit” for a binary digit. His theory made possible the digital storage and transmission of pictures, sounds, and so forth.

In December 1947, three scientists invented the semiconductor point contact transistor at Bell Labs using germanium. The earlier computers and radios were based on bulky vacuum tubes that generated a huge amount of heat. The invention of the transistor by William Shockley, John Bardeen, and Walter Brattain was recognized through the Nobel Prize in Physics conferred on them in 1956. There had been intense research on semiconductor physics using the atomic theory and the mechanism of point contact for the fabrication of transistors to become possible. The invention of transistors ushered in the information age with a whole new industry. In 1954, Gordon Teal at Texas Instruments built the first silicon transistor. The native oxide of silicon appeared to be particularly suitable as the electric insulator. In 1958, Jack Kilby (1923–2005) at Texas Instruments was able to cram all the discrete components onto a silicon base and later onto one piece of germanium. He filed a patent application the next year on “Miniaturized Electronic Circuits,” where he described how to make integrated circuits and connect the passive components via gold wires.
Working independently, Robert Noyce at Fairchild Semiconductor in California found aluminum to adhere well to both silicon and silicon oxide and filed a patent application in 1959 on “Semiconductor Device-and-Lead Structure.” Kilby and Noyce are considered the coinventors of integrated circuits. Noyce was one of the founders of Intel and died in 1990. Kilby was awarded half of the Nobel Prize in Physics in 2000 “for his part in the invention of the integrated circuit.” (See http://nobelprize.org/nobel_prizes/). The other half was shared by Zhores Alferov and Herbert Kroemer for developing semiconductor heterogeneous structures used in optoelectronics, to be discussed in the next section.

In 1965, around 60 transistors could be packed on a single silicon chip. Seeing the fast development and future potential of integrated circuits, Gordon Moore, a cofounder of


Intel, made a famous prediction that the number and complexity of semiconductor devices would double every year.6 This is Moore’s law, well-known in the microelectronics industry. In the mid-1970s, the number of transistors on a chip increased from 60 to 5000. By 1985, the Intel 386 processor contained a quarter million transistors on a chip. In 2001, the Pentium 4 processor reached 42 million transistors. The number has now exceeded 1 billion per chip in 2006. When the device density is plotted against time in a log scale, the growth almost follows a straight line, suggesting that the packaging density has doubled approximately every 18 months.6 Reducing the device size and increasing the packaging density have several advantages. For example, the processor speed increases by reducing the distance between transistors. Furthermore, new performance features can be added into the chip to enhance the performance. The cost for the same performance also reduces. Advanced supercomputer systems have played a critical role in enabling modeling and understanding micro/nanoscale phenomena.

The process is first to grow high-quality silicon crystals and then dice and polish into wafers. Devices are usually made on a SiO₂ layer that can be grown by heating the wafer to sufficiently high temperatures in a furnace with controlled oxygen partial pressure. The wafers are then patterned using photolithographic techniques combined with etching processes. Donors and acceptors are added to the wafer to form n- and p-type regions by ion implantation and then annealed in a thermal environment. Metals or heavily doped polycrystalline silicon are used as gates with proper coverage and patterns through lithography. A schematic of metal-oxide-semiconductor field-effect transistor (MOSFET) is shown in Fig. 1.3. Millions of transistors can be packed in 1-mm² area with several layers through very-large-scale integration (VLSI) with the smallest features smaller than 100 nm. As mentioned earlier, managing heat dissipation is a challenge especially as the device dimension continues to shrink. Local heating or hot spots on the size of 10 nm can cause device failure. The principles governing the heat transfer at the nanoscale are very different from those at large scales. A fundamental understanding of the phonon transport is required for device-level thermal analysis. Furthermore, understanding heat transfer in microfluidics is necessary to enable reliable device cooling at the micro- and nanoscales. Additional discussions will be given in subsequent chapters of the book.

FIGURE 1.3 Schematic of a metal-oxide-semiconductor field-effect transistor (MOSFET).

The progress in microelectronics is not possible without the advances in materials such as crystal growth and thermal processing during semiconductor manufacturing, as well as the deposition and photolithographic technologies. Rapid thermal processing (RTP) is necessary during annealing and oxidation to prevent ions from deep diffusion into the wafer. Thermal modeling of RTP must consider the combined conduction, convection, and radiation modes. A lightpipe thermometer is commonly used to monitor the temperature of the wafer. In an RTP furnace, the thermal radiation emitted by the wafer is collected by the light pipe and then transmitted to the radiometer for inferring the surface temperature.7 In some cases,


the wafer surface is rough with anisotropic features. A better understanding of light scattering by anisotropic rough surfaces is also necessary. According to the International Technology Roadmap for Semiconductors, the gate length and the junction depth will be 25 and 13.8 nm, respectively, for the 65-nm devices used in high-performance complementary-metal-oxide-semiconductor (CMOS) technology.8 High-intensity Ar or Xe arc lamps with millisecond optical pulses are considered as a suitable annealing tool following ion implantation in ultra-shallow junction fabrication. Because the optical energy is absorbed within milliseconds, thermal diffusion cannot distribute heat uniformly across the wafer surface. Therefore, temperature uniformity across the nanometer-patterned wafer is expected to be a critical issue. To reduce the feature size further, deep-UV lithography and x-ray lithography have also been developed. It is predicted that Moore’s law will reach its limit in 2017, when the critical dimensions would be less than 10 nm. Further reduction will be subjected to serious barriers due to problems associated with gate dielectrics and fabrication difficulties. Molecular nanoelectronics using self-assembly is sought as an alternative, along with quantum computing. Therefore, nanoelectronics and quantum computing are anticipated to brighten the electronics and computer future.
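Two estimates from this section lend themselves to a quick calculation: the doubling time implied by the transistor counts quoted above (about 60 per chip in 1965 and about 1 billion in 2006), and how far heat can diffuse in silicon during a millisecond lamp pulse. The sketch below is a rough illustration only. The diffusion length is taken as sqrt(αt), and the thermal diffusivity of about 9 × 10⁻⁵ m²/s is an assumed room-temperature value for silicon that does not come from the text and would be lower at annealing temperatures. The function names are introduced here for illustration.

```python
import math


def doubling_time_years(count_start: float, count_end: float, elapsed_years: float) -> float:
    """Doubling time for exponential growth between two counts."""
    if count_end <= count_start or count_start <= 0:
        raise ValueError("counts must be positive and increasing")
    return elapsed_years / math.log2(count_end / count_start)


def thermal_diffusion_length_m(diffusivity_m2_s: float, duration_s: float) -> float:
    """Order-of-magnitude diffusion length, sqrt(alpha * t)."""
    return math.sqrt(diffusivity_m2_s * duration_s)


# Transistor counts quoted in the text: about 60 in 1965, about 1e9 in 2006.
months = 12.0 * doubling_time_years(60.0, 1.0e9, 2006 - 1965)
print(f"implied doubling time: {months:.1f} months")

# Assumed room-temperature thermal diffusivity of silicon (illustrative value).
ALPHA_SI_M2_S = 9.0e-5
length_mm = 1e3 * thermal_diffusion_length_m(ALPHA_SI_M2_S, 1.0e-3)
print(f"diffusion length during a 1 ms pulse: {length_mm:.2f} mm")
```

The implied doubling time is of the same order as the 18 months cited above. The diffusion length of a fraction of a millimeter is far smaller than the lateral extent of a wafer, which is consistent with the concern about temperature uniformity during millisecond annealing.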

1.3.2 Lasers, Optoelectronics, and Nanophotonics

It is hard to imagine what the current technology would look like without lasers. Lasers of different types have tremendous applications in metrology, microelectronics fabrication, manufacturing, medicine, and communication. Examples are laser printers, laser bar code readers, laser Doppler velocimetry, laser machining, and laser corneal surgery for vision correction. The concept of the laser was demonstrated in the late 1950s independently in the United States and the Soviet Union during the cold war. The Nobel Prize in Physics of 1964 recognized the fundamental contributions in the field of quantum electronics by Charles Townes, Nicolay Basov, and Aleksandr Prokhorov. The first working laser was a ruby laser built by Theodore Maiman at Hughes Aircraft Company in 1960. The principle of lasers dates back to 1917, when Einstein elegantly depicted his conception of stimulated emission of radiation by atoms. Unlike thermal emission and plasma emission, lasers are coherent light sources and, with the assistance of an optical cavity, lasers can emit nearly monochromatic light and point to the same direction with little divergence.

Lasers enabled a branch of nonlinear optics, which is important to understand the fundamentals of light-matter interactions, communication, as well as optical computing. In 1981, Nicolaas Bloembergen and Arthur Schawlow received the Nobel Prize in Physics for their contributions in laser spectroscopy. There are a variety of nonlinear spectroscopic techniques, including Raman spectroscopy, as reviewed by Fan and Longtin.9 Two-photon spectroscopy has become an important tool for molecular detection.10 Furthermore, two-photon 3-D lithography has also been developed for microfabrication.11

Gas lasers such as He-Ne (red) and Ar (green) have been extensively used for precision alignment, dimension measurements, and laser Doppler velocimetry due to their narrow linewidth.
On the other hand, powerful Nd:YAG and CO₂ lasers are used in thermal manufacturing, where the heat transfer processes include radiation, phase change, and conduction.12 Excimer lasers create nanosecond pulses in ultraviolet and have been extensively used in materials processing, ablation, eye surgery, dermatology, as well as photolithography in microelectronics and microfabrication. High-energy nanosecond pulses can also be produced by Q-switching, typically with a solid state laser such as Nd:YAG laser at a wavelength near 1 µm. On the other hand, mode-locking technique allows pulse widths from picoseconds down to a few femtoseconds. Pulse durations less than 10 fs have been achieved since 1985. Ultrafast lasers have enabled the study of reaction dynamics and formed a branch in chemistry called femtochemistry. Ahmed Zewail of Caltech received the


1999 Nobel Prize in Chemistry for his pioneering research in this field. In 2005, John Hall and Theodor Hänsch received the Nobel Prize in Physics for developing laser-based precision spectroscopy, in particular, the frequency comb technique. Short-pulse lasers can facilitate fabrication, the study of electron-phonon interaction in the nonequilibrium process, measurement of thermal properties including interface resistance, nondestructive evaluation of materials, and so forth.13–16

Room-temperature continuous-operation semiconductor lasers were realized in May 1970 by Zhores Alferov and coworkers at the Ioffe Physical Institute in Russia, and independently by Morton Panish and Izuo Hayashi at Bell Labs a month later. Alferov received the Nobel Prize in Physics in 2000, together with Herbert Kroemer who conceived the idea of double-heterojunction laser in 1963 and was also an earlier pioneer of molecular beam epitaxy (MBE). Invented in 1968 by Alfred Cho and John Arthur at Bell Labs and developed in the 1970s, MBE is a high-vacuum deposition technique that enables the growth of highly pure semiconductor thin films with atomic precision. The name heterojunction refers to two layers of semiconductor materials with different bandgaps, such as a GaAs/Al$_x$Ga$_{1-x}$As pair. In a double-heterojunction structure, a lower-bandgap layer is sandwiched between two higher-bandgap layers.17 When the middle layer is made thin enough, on the order of a few nanometers, the structure is called a quantum well because of the discrete energy levels and enhanced density of states. Quantum well lasers can have better performance with a smaller driving current. Multiple quantum wells (MQWs), also called superlattices, that consist of periodic structures can also be used to further improve the performance. In a laser setting, an optical cavity is needed to confine the laser bandwidth as well as enhance the intensity at a desired wavelength with narrow linewidth.
Distributed Bragg reflectors (DBRs) are used on both ends of the quantum well (active region). DBRs are the simplest photonic crystals made of periodic dielectric layers of different refractive indices; each layer thickness is equal to a quarter of the wavelength in that medium (λ/n). DBRs are dielectric mirrors with nearly 100% reflectance, except at the resonance wavelength λ, where light will eventually escape from the cavity. Figure 1.4 illustrates a vertical cavity surface emitting laser (VCSEL), where light is emitted through the substrate (bottom of the structure). The energy transfer mechanisms through phonon waves and electron waves have been extensively investigated.18

FIGURE 1.4 Schematic of a VCSEL laser made of heterogeneous quantum well structure. The smaller layer thickness can be 3 nm, and there can be as many as several hundred layers.
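The quarter-wave condition for a DBR can be illustrated numerically. Since the wavelength inside a layer of refractive index n is λ/n, each layer is λ/(4n) thick. The sketch below also evaluates the standard normal-incidence peak reflectance of an ideal, lossless quarter-wave stack, which follows from the characteristic-matrix method. The refractive indices (3.5 and 3.0), the 980-nm design wavelength, the incident medium and the number of pairs are hypothetical values chosen only for illustration, not data from the text.

```python
def quarter_wave_thickness_nm(vacuum_wavelength_nm: float, refractive_index: float) -> float:
    """Layer thickness equal to a quarter of the wavelength inside the medium."""
    return vacuum_wavelength_nm / (4.0 * refractive_index)


def dbr_peak_reflectance(n_incident: float, n_first: float, n_second: float,
                         n_substrate: float, pairs: int) -> float:
    """Peak reflectance of an ideal quarter-wave stack at normal incidence.

    n_first is the layer next to the incident medium. Each quarter-wave pair
    scales the effective admittance seen from the incident side by
    (n_first / n_second)**2, starting from the substrate index.
    """
    if pairs < 0:
        raise ValueError("pairs must be non-negative")
    admittance = n_substrate * (n_first / n_second) ** (2 * pairs)
    return ((n_incident - admittance) / (n_incident + admittance)) ** 2


# Hypothetical design values, for illustration only.
WAVELENGTH_NM = 980.0
N_HIGH, N_LOW = 3.5, 3.0

print(f"high-index layer: {quarter_wave_thickness_nm(WAVELENGTH_NM, N_HIGH):.1f} nm")
print(f"low-index layer:  {quarter_wave_thickness_nm(WAVELENGTH_NM, N_LOW):.1f} nm")
for pairs in (5, 10, 20):
    reflectance = dbr_peak_reflectance(1.0, N_HIGH, N_LOW, N_HIGH, pairs)
    print(f"{pairs:2d} pairs -> peak reflectance {reflectance:.4f}")
```

Because the index contrast between the two assumed materials is small, many pairs are needed before the reflectance approaches unity, which is in line with the large number of layers mentioned in the caption of Fig. 1.4.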


Further improvement in the laser efficiency and control of the wavelength has been made using quantum wires and quantum dots (QDs).17 Semiconductor lasers are the most popular lasers (in quantity), and several hundred-million units are sold each year. Their applications include CD/DVD reading/writing, optical communication, laser pointers, laser printers, bar code readers, and so forth. A simpler device is the light-emitting diode (LED), which emits incoherent light with a two-layer p-n junction without DBRs. LEDs have been used for lighting, including traffic lights with improved efficiency and decorating lights. The development of wide-bandgap materials, such as GaN and AlN epitaxially grown through metal-organic chemical vapor deposition (MOCVD), allows the LED and semiconductor laser wavelength to be pushed to the blue and ultraviolet. Organic light-emitting diodes (OLEDs) based on electroluminescence are being developed as a promising candidate for the next-generation computer and TV displays. Alongside the development of light sources, there have been continuous development and improvement in photodetectors, mainly in focal plane arrays, charge-coupled devices (CCDs), quantum well detectors, readout electronics, data transfer and processing, compact refrigeration and temperature control, and so forth. On the other hand, optical fibers have become an essential and rapidly growing technology in telecommunication and computer networks. The optical fiber technology for communication was developed in the 1970s along with the development of semiconductor lasers. In 1978, Nippon Telegraph and Telephone (NTT) demonstrated the transmission of 32 Mbps (million-bits-per-second) through 53 km of graded-index fiber at 1.3-m wavelength. By 2001, 3 1011 m of fiberoptic wires had been installed worldwide; this is a round-trip from the earth to the sun. In March 2006, NEC Corporation announced a 40-Gbps optical-fiber transmission system. 
Optical fibers have also been widely applied as sensors for biochemical detection as well as temperature and pressure measurements. The fiber drawing process involves complicated heat transfer and fluid dynamics at different length scales and temperatures.19 Nanophotonics is an emerging frontier that integrates photonics with physics, chemistry, biology, materials science, manufacture, and nanotechnology. The foundation of nanophotonics is to study interactions between light and matter, to explore the unique characteristics of nanostructures for utilizing light energy, and to develop novel nanofabrication and sensing techniques. Recent studies have focused on photonic crystals, nanocrystals, plasmonic waveguides, nanofabrication and nanolithography, light interaction with organic materials, biophotonics, biosensors, quantum electrodynamics, nanocavities, quantum dot and quantum wire lasers, solar cells, and so forth. Readers are referred to Prasad20 for an extensive discussion of the recent developments. In the field of thermal radiation, nanoscale radiative transfer and properties have become an active research area, and a special issue of the Journal of Heat Transfer is devoted to this exciting area.21

1.3.3 Microfabrication and Nanofabrication

Richard Feynman, one of the best theoretical physicists of his time and a Nobel Laureate in Physics, delivered a visionary speech at Caltech in December 1959, entitled “There’s plenty of room at the bottom.” At that time, lasers did not yet exist, integrated circuits had just been invented and were not practically useful, and a single computer that was not as fast as a present-day handheld calculator would occupy a whole classroom and generate enormous heat.
Feynman envisioned the future of controlling and manipulating things on very small scales, such as writing (with an electron beam) the whole 24 volumes of Encyclopedia Britannica on the head of a pin and rearranging atoms one at a time.22 Many of the things Feynman predicted were once considered science fiction or jokes but have by now been realized in practice, especially since the 1980s. In 1983, Feynman gave a second talk about the use of a swimming machine as a medical device (the surgeon that you could swallow), as well as quantum computing.22 In the 1990s, micromachining and MEMS

INTRODUCTION

emerged as an active research area, with great success in the commercialization of micromachined accelerometers for automobile airbags. Using etching and lithographic techniques, engineers were able to manufacture microscopic machines with moving parts, as shown in Fig. 1.5, such as gears smaller than the cross section of a human hair.

FIGURE 1.5 MEMS structures. (a) A dust mite on a microfabricated mirror assembly, where the gears are smaller than the thickness of human hair. (b) Drive gear chain with linkages, where coagulated red blood cells are on the upper left and the lower right and a grain of pollen is on the upper right. (Courtesy of Sandia National Laboratories, SUMMiT Technologies, www.mems.sandia.gov.)

The technologies used in microfabrication have been extensively discussed in the text of Madou.23 These MEMS devices were later developed as tools for biological and medical diagnostics, such as the so-called lab-on-a-chip, with pump, valve, and analysis sections on the 10 to 100 µm scale. In aerospace engineering, an application is to build micro–air vehicles or microflyers, with sizes ranging from a human hand down to a bumblebee, that could be used for surveillance and reconnaissance under extreme conditions. Microchannels and microscale heat pipes have also been developed and tested for electronic cooling applications. The study of microfluidics has naturally become an active research area in mechanical engineering. The development of SPM and MEMS technologies, together with materials development through self-assembly and other technologies, has led to further development of even smaller structures and the bottom-up approach of nanotechnology. Laser-based manufacturing, focused ion beam (FIB), and electron-beam lithography have also been developed to facilitate nanomanufacturing. In NEMS, quantum behavior becomes important and quantum mechanics is indispensable for understanding it. Robert Curl, Harold Kroto, and Richard Smalley were winners of the Nobel Prize in Chemistry in 1996 for their discovery of fullerenes in 1985 at Rice University, during a period when Kroto was visiting from the University of Sussex. The group used pulsed laser irradiation to vaporize graphite and form carbon plasma in a pressurized helium gas stream. The result as diagnosed by time-of-flight mass spectroscopy suggested that self-assembled C60 molecules were formed and would be shaped like a soccer ball with 60 vertices made of the 60 carbon atoms; see Kroto et al., Nature, 318, 162 (1985). It was later confirmed that these were indeed C60 molecules, with a diameter on the order of 1 nm, that exhibit wave-particle duality.
This type of carbon allotrope is called a buckminsterfullerene, or fullerene, or buckyball, after the famous architect Buckminster Fuller (1895–1983) who designed geodesic domes. In 1991, Sumio Iijima of NEC Corporation synthesized carbon nanotubes (CNTs) using

arc discharge. Soon his group and an IBM group were able to produce single-walled carbon nanotubes (SWNTs) with a diameter on the order of 1 nm. There have been intensive studies of CNTs for hydrogen storage, nanotransistors, field emission, light emission and absorption, quantum conductance, nanocomposites, and high thermal conductivity. Figure 1.6a shows CNT growth in a room-temperature environment by chemical vapor deposition on a heated cantilever tip with a size around 5 µm.24 Figure 1.6b shows the synthesized SWNTs with encapsulated metallofullerenes of Gd:C82 (i.e., a gadolinium atom inside a fullerene molecule). The high-resolution transmission electron microscope (TEM) image suggests that the diameter of the SWNT is from 1.4 to 1.5 nm.25 It should be noted that electron microscopes, including SEM and TEM, have become a powerful tool for imaging micro/nanoscale objects with a magnification of up to 2 million times. The first electron microscope was built by Ernst Ruska and Max Knoll in Germany during the early 1930s, and Ruska shared the Nobel Prize in Physics in 1986 for his contributions to electron optics and microscopy. Various nanostructured materials have been synthesized, such as silicon nanowires, InAs/GaAs QDs, and Ag nanorods. Figure 1.6c shows some images for nanohelices or nanosprings made of ZnO nanobelts or nanoribbons using a solid-vapor process.26 These self-assembled structures under controlled conditions could be fundamental to the study of electromagnetic coupled nanodevices for use as sensors and actuators, as well as the growth dynamics at the nanoscale. One of the successful technologies that operate in the quantum mechanical regime is the giant magnetoresistive (GMR) head and hard drive. The GMR head is based on ferromagnetic layers separated by an extremely thin (about 1 nm) nonferromagnetic spacer, such as Fe/Cr/Fe and Co/Cu/Co. MBE enabled the metallic film growth with the required precision and quality.
The electrical resistance of GMR materials depends strongly on the applied magnetic field, which affects the spin states of electrons. IBM first introduced this technology in 1996, which was only about 10 years after the publication of the original research results; see Grünberg et al., Phys. Rev. Lett., 57, 2442 (1986); Baibich et al., Phys. Rev. Lett., 61, 2472 (1988). GMR materials have been extensively used in computer hard drives and read/write heads. Overheating, due to friction with the disk surface, can render the data unreadable for a short period until the head temperature stabilizes; such an effect is called thermal asperity. Yang et al. performed a detailed thermal characterization of Cu/CoFe superlattices for GMR head applications using MEMS-based thermal metrology tools.27

1.3.4 Probing and Manipulation of Small Structures

Tunneling by elementary particles is a quantum mechanical phenomenon that arises from their wavelike behavior. A potential barrier normally confines particles to either side of the barrier, like a mountain that is so high as to separate people on one side from those on the other. When the barrier is thin enough, quantum tunneling can occur and particles can transmit through the barrier, as if a tunnel is dug through a mountain. An example is an insulator between two metal strips. Trained in mechanical engineering, Ivar Giaever performed the first tunneling experiment with superconductors in 1960 at the General Electric Research Laboratory and received the 1973 Nobel Prize in Physics, together with Leo Esaki of IBM and Brian Josephson. Esaki made significant contributions in semiconductor tunneling, superlattices, and the development of MBE technology. He invented a tunneling diode, called the Esaki diode, which is capable of very fast operation in the microwave region. Josephson further developed the tunneling theory and a device, called a Josephson junction, which is used in the superconducting quantum interference devices (SQUIDs), for measuring extremely small magnetic fields. SQUIDs are used in magnetic resonance imaging (MRI) for medical diagnostics.
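The strong dependence of tunneling on barrier thickness described above can be illustrated with the textbook estimate for a rectangular barrier, T ≈ exp(−2κd) with κ = (2mφ)^(1/2)/ħ. This is a minimal sketch rather than a model of any specific device: the 4-eV barrier height is an assumed value typical of a metal work function, and the function names are illustrative.

```python
import math

# Physical constants (SI units)
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt


def decay_constant(barrier_ev):
    """Decay constant kappa (1/m) of the electron wavefunction in the barrier."""
    return math.sqrt(2.0 * M_E * barrier_ev * EV) / HBAR


def transmission(barrier_ev, width_m):
    """Approximate tunneling probability through a thick rectangular barrier."""
    return math.exp(-2.0 * decay_constant(barrier_ev) * width_m)


# ASSUMPTION: 4 eV barrier height, chosen only for illustration.
kappa = decay_constant(4.0)
ratio = transmission(4.0, 1.1e-9) / transmission(4.0, 1.0e-9)

print(f"kappa = {kappa * 1e-9:.1f} per nm")
print(f"transmission ratio when the barrier grows from 1.0 to 1.1 nm: {ratio:.2f}")
```

With these assumed numbers, widening the barrier by only 0.1 nm reduces the transmission by nearly an order of magnitude, which is why tunneling is observed only across barriers about a nanometer thick or less.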

FIGURE 1.6 Examples of nanostructures. (a) SEM image of CNTs grown on heated cantilever tip. (Reprinted with permission from Sunden et al.,24 copyright 2006, American Institute of Physics.) (b) Buckyballs inside a SWNT (the lower is a TEM image in which the nanotube diameter is 1.4 to 1.5 nm). (Reprinted with permission from Hirahara et al.,25 copyright 2000, American Physical Society.) (c) TEM images of ZnO nanobelts that are coiled into nanohelices or nanosprings. [Reprinted with permission from Gao et al.,26 copyright 2005, AAAS (image courtesy of Prof. Z. L. Wang, Georgia Tech).]

In 1981, Gerd Binnig and Heinrich Rohrer of IBM Zurich Research Laboratory developed the first scanning tunneling microscope (STM) based on electron tunneling through vacuum. This invention has enabled the detection and manipulation of surface phenomena at the atomic level and, thus, has largely shaped the nanoscale science and technology through further development of similar instrumentation. Binnig and Rohrer shared the Nobel Prize in Physics in 1986, along with Ruska, who developed the first electron microscope as mentioned earlier. STM uses a sharp stylus probe tip and piezoelectricity for motion control. When the tip is within about 1 nm of the surface, electrons can tunnel between the tip and the conductive substrate. The tunneling current is very sensitive to the gap. Therefore, by maintaining the tip in position and scanning the substrate in the x-y direction with a constant current (or distance), the height variation can be obtained with extremely good resolution (0.02 nm). Using STM, Binnig et al. soon obtained the real-space reconstruction of the 7 × 7 unit cells of Si(111).28 In 1993, another group at IBM Almaden Research Center was able to manipulate iron atoms to create a 48-atom quantum corral on a copper substrate.29 The images have appeared on the front cover of many magazines, including Science and Physics Today. STM can also be used to assemble organic molecules and to study DNA molecules.2 In 1986, Gerd Binnig, Calvin Quate, and Christoph Gerber developed another type of SPM, i.e., the atomic force microscope (AFM), which can operate without a vacuum environment and on electrical insulators.30 AFM uses a tapered tip at the end of a cantilever and an optical position sensor, as shown in Fig. 1.7. The position sensor is very sensitive to the

FIGURE 1.7 Schematic of an atomic force microscope (AFM).

bending of the cantilever (with a 0.1-nm vertical resolution). When the tip is brought close to the surface, there exist intermolecular forces (repulsive or attractive) between the tip and the atoms on the underneath surface. In the contact mode, the cantilever is maintained in position using the servo signal from the position-sensing diode to adjust the height of the sample, while it scans in the lateral direction. Surface topographic data can be obtained in

an ambient environment for nonconductive materials. Other SPMs have also been developed and the family of SPMs is quite large today. Wickramasinghe and coworkers first investigated thermal probing by attaching a thermocouple to the cantilever tip.31 Later, Arun Majumdar’s group developed several types of scanning thermal microscope (SThM) for nanoscale thermal imaging of heated samples, including microelectronic devices and nanotubes.32 Recently, researchers have modified SThM for measuring and mapping thermoelectric power at nanoscales.33 Because of its simplicity, AFM has become one of the most versatile tools in nanoscale research, including friction measurements, nanoscale indentation, dip-pen nanolithography, and so forth. Heated cantilever tips were proposed for nanoscale indentation or writing on the polymethyl methacrylate (PMMA) surface, either using a laser or by heating the cantilever legs.34 The method was further developed to concentrate the heat dissipation to the tip by using heavily doped legs as electrical leads, resulting in writing (with a density near 500 Gb/in²) and erasing (with a density near 400 Gb/in²) capabilities. The temperature signal measured by the tip resistance can also be used to read the stored data due to the difference in heat loss as the tip scans the area.35 In an effort to improve the data-writing speed, IBM initiated the “millipede” project in 2000 and succeeded in making a 32 × 32 heated-cantilever array in which each cantilever was separately controlled.36 Obviously, heat transfer and mechanical characteristics are at the center of these systems. The heated AFM cantilever tips have been used as a local heating source for a number of applications, including the above-mentioned CVD growth of CNTs locally and thermal dip-pen nanolithography.37
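To put the storage densities quoted above on a length scale, the areal density can be converted to an approximate bit spacing. The sketch below assumes that the bits lie on a square lattice and that 1 Gb means 10⁹ bits; neither assumption is stated in the text, and the function name is illustrative.

```python
import math

INCH_NM = 2.54e7  # one inch expressed in nanometres


def bit_pitch_nm(density_gb_per_in2):
    """Centre-to-centre bit spacing (nm) for a given areal density.

    ASSUMPTIONS: bits on a square lattice; 1 Gb = 1e9 bits.
    """
    bits_per_in2 = density_gb_per_in2 * 1e9
    area_per_bit_nm2 = INCH_NM ** 2 / bits_per_in2
    return math.sqrt(area_per_bit_nm2)


pitch_write = bit_pitch_nm(500.0)   # writing density quoted in the text
pitch_erase = bit_pitch_nm(400.0)   # erasing density quoted in the text
tips_in_array = 32 * 32             # cantilevers in the 32 x 32 array

print(f"bit pitch at 500 Gb/in^2: {pitch_write:.1f} nm")
print(f"bit pitch at 400 Gb/in^2: {pitch_erase:.1f} nm")
print(f"cantilevers operating in parallel: {tips_in_array}")
```

Under these assumptions each indentation occupies a cell roughly 36 to 40 nm on a side, so the heated region under the tip must be confined to a few tens of nanometers.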

1.3.5 Energy Conversion Devices

Nanostructures may have unique thermal properties that can be used to facilitate heat transfer for heat removal and thermal management applications. One example mentioned earlier is the use of nanotubes with high thermal conductivity, although nanotube bundles often suffer from interface resistance and phonon scattering by defects and boundaries. Recently, there have been a number of studies on nanofluids, which are liquids with suspensions of nanostructured solid materials, such as nanoparticles, nanofibers, and nanotubes.38 Enhanced thermal conductivity and increased heat flux have been demonstrated; however, the mechanisms that contributed to the enhancement and temperature dependence are still being debated.39 Thermoelectricity utilizes the irreversible thermodynamics principle for thermal-electrical conversion and can be used for cooling in microelectronics as well as miniaturized power generation. A critical issue is to enhance the thermoelectric figure of merit by reducing the thermal conductivity. Multilayer heterogeneous structures create heat barriers due to size effects and the boundary resistance. These structures have been extensively studied in the literature and demonstrate enhanced performance. Understanding the thermal and electrical properties of heterogeneous structures is critically important for future design and advancement.40 Fast-depleting reserves of conventional energy sources have resulted in an urgent need for increasing energy conversion efficiencies and recycling of waste heat. One of the potential candidates for fulfilling these requirements is thermophotovoltaic devices, which generate electricity from either the complete combustion of different fuels or the waste heat of other energy sources, thereby saving energy. The thermal radiation from the emitter is incident on a photovoltaic cell, which generates electrical currents.
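For reference, the thermoelectric figure of merit mentioned above is commonly written in the dimensionless form below. The symbols are not defined in this section and are introduced here only for illustration: S is the Seebeck coefficient, σ the electrical conductivity, T the absolute temperature, and κ the total thermal conductivity, taken as the sum of an electronic and a lattice (phonon) contribution.

```latex
ZT = \frac{S^{2}\sigma T}{\kappa},
\qquad
\kappa = \kappa_{\mathrm{e}} + \kappa_{\mathrm{ph}}
```

In this form it is apparent why a reduced thermal conductivity is desirable: if boundary scattering in a multilayer structure lowers the lattice contribution without a comparable loss in S²σ, the figure of merit increases.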
Applications of such devices range from hybrid electric vehicles to power sources for microelectronic systems. At present, thermophotovoltaic systems suffer from low conversion efficiency. Nanostructures have been extensively used to engineer surfaces with designed absorption, reflection, and emission characteristics. Moreover, at the nanoscale, the radiative energy transfer can be greatly enhanced due to tunneling and enhanced local density of states. A viable solution to

increase the thermophotovoltaic efficiency is to apply microscale radiation principles in the design of different components to utilize the characteristics of thermal radiation at small distances and in microstructures.41 Nanostructures can also help increase the energy conversion efficiency and reduce the cost of solar cells. Figure 1.8 shows the device structure of a ZnO-nanowire array for dye-sensitized solar cells.42 This structure can greatly enhance the absorption or quantum efficiency over nanoparticle-based films. Knowledge of the spectral and directional absorptance of nanostructures and heat dissipation mechanisms is critically important for further advancement of this type of device.43

FIGURE 1.8 ZnO nanowires for dye-sensitized solar cells, from Law et al.42 (Reprinted by permission from Macmillan Publishers Ltd.: Nature Materials, copyright 2005.) The height of the wires is near 16 µm and their diameters vary between 130 and 200 nm. (a) Schematic of the cell with light incident through the bottom electrode; the labeled parts are the platinized electrode, the dye-coated nanowire array in electrolyte, and the transparent electrode, with the electron (e−) path indicated. (b) SEM image of a cleaved nanowire array (scale bar: 5 µm).

Hydrogen technologies are being considered and actively pursued as the energy source of the future. There are two ways in which hydrogen H2 may be used: one is in a combustion heat engine where hydrogen reacts with oxygen intensively while releasing heat; the other is in a fuel cell where electrochemical reaction occurs quietly to generate electricity just like a battery. Because the only reaction product is water, hydrogen-powered automobiles can be made pollution free in principle. Grand challenges exist in generation, storage,

and transport of hydrogen. If all hydrogen is obtained from fossil fuels, there will be no reduction in either the fossil fuel consumption or the carbon dioxide emission, except that the emission is centralized in the hydrogen production plant. Alternatively, hydrogen may be produced from water with other energy sources, such as renewable energy sources. Nanomaterials are being developed for several key issues related to hydrogen technologies, such as hydrogen storage using nanoporous materials, effective hydrogen generation by harvesting solar energy with inexpensive photovoltaic materials, and fuel cells based on nanostructure catalysts.44

1.3.6 Biomolecule Imaging and Molecular Electronics

Optical microscopy has played an instrumental role in medical diagnoses because it allows us to see bacteria and blood cells. Optical wavelengths are more desirable than x-rays or electron beams because they are less invasive and more convenient. However, the resolution of a traditional microscope is on the order of half the wavelength due to the diffraction limit. While the concept of near-field imaging existed in the literature before 1930, it was largely forgotten because of the inability to build the structures and control their motion. With the microfabrication and precision-positioning capabilities, near-field scanning optical microscopes (NSOMs, also called SNOMs) were realized in the early 1980s by different groups and extensively used for biomolecule imaging with a resolution of 20 to 50 nm.45 The principle is to bring the light through a tapered fiber whose end has a very small diameter, or through an aperture of very small diameter. The beam emerging from the fiber tip or aperture will diverge quickly if the sample is placed in the far field, i.e., away from the aperture. However, high resolution can be achieved by placing the sample in close proximity to the aperture within a distance much less than the wavelength, i.e., in the near field, such that the beam size is almost the same as the aperture. An apertureless metallic tip can be integrated with an SPM to guide the electromagnetic wave via surface plasmon resonance with a spatial resolution as high as 10 nm, for high-resolution imaging and processing. There have since been extensive studies on near-field interactions between electromagnetic waves and nanostructured materials, from semiconductor QDs, metallic nanoaperture and nanohole arrays, to DNA and RNA structures.
Nanoparticles are among the earliest known nanostructures that have been used for centuries in making stained glass with gold or other metallic nanoparticles as well as photographic films with silver nanoparticles. A QD has a spherical core encapsulated in a shell made of another semiconductor material, such as a CdSe core in a ZnS shell. The outer shell is only several monolayers thick, and the diameters of QDs range from 2 to 10 nm. The material for the inner core has a smaller bandgap. Quantum confinement in the core results in size-dependent fluorescent properties. Compared with molecular dyes conventionally used for fluorescent labeling in cellular imaging, the emission from QD fluorophores is brighter with a narrower spectral width. QDs also allow excitation at shorter wavelengths, making it easier to separate the fluorescent signal from the scattered one, and are resistant to photobleaching that causes dyes to lose fluorescence. Furthermore, the emission wavelength can be selected by varying the core size of QDs to provide multicolor labeling. It was first demonstrated in 1998 that QDs could be conjugated to biomolecules such as antibodies, peptides, and DNA, enabling surface passivation and water solubility. In recent years, significant development has been made to employ QDs for in vivo and in vitro imaging, labeling, and sensing.46,47 CMOS technology is a top-down semiconductor fabrication process, in which patterns are created by first making a mask and then printing the desired features onto the surface of the wafer via lithography. Integrated circuits have dominated the technological and economic progress in the past 30 years, and complex and high-density devices have been manufactured on silicon wafers. However, this technology is going to reach a limit in 10 to 15 years, when the smallest feature size is less than 10 nm. Molecular electronics is considered
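The gain offered by near-field imaging can be made concrete by comparing the resolutions quoted above with the far-field diffraction limit. The sketch below uses the Abbe estimate d = λ/(2·NA); the 500-nm wavelength and the numerical aperture of 1 are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def abbe_limit_nm(wavelength_nm, numerical_aperture=1.0):
    """Far-field diffraction-limited resolution (Abbe estimate), in nm."""
    return wavelength_nm / (2.0 * numerical_aperture)


# ASSUMPTION: green light at 500 nm and a numerical aperture of 1.
far_field_nm = abbe_limit_nm(500.0)

# Near-field resolutions quoted in the text (nm)
nsom_coarse_nm, nsom_fine_nm = 50.0, 20.0
gain_low = far_field_nm / nsom_coarse_nm
gain_high = far_field_nm / nsom_fine_nm

print(f"far-field limit: {far_field_nm:.0f} nm")
print(f"NSOM improvement: {gain_low:.1f}x to {gain_high:.1f}x")
```

For these assumed values the far-field limit is about 250 nm, so a near-field resolution of 20 to 50 nm represents an improvement of roughly 5 to 12 times.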

as a promising alternative.48 A 3-D assembly with short interconnect distances would greatly increase the information storage density and transfer speed with reduced power consumption and amount of heat dissipated. Self-assembly refers to naturally occurring processes, from biological growth to galaxy formation. In materials synthesis, self-assembly implies that the end products or structures are formed under favorable conditions and environments. An example is the growth of bulk crystals from a seed. Fullerenes and nanotubes are formed by self-assembly, not by slicing a graphite piece and then rolling and bending it to the shape of a tube or a shell. Self-assembly is referred to as a bottom-up process, like constructing an airplane model with LEGO pieces. Biological systems rely on self-assembly and self-replication to develop. Since 2000, CNT-based transistors have been built by several groups and found to be able to outperform Si-based ones. Transistors have also been created using a single molecule of a transition-metal organic complex nanobridge between two electrodes.49 Because of the small dimensions, quantum mechanics should govern the electrical and mechanical behaviors. Figure 1.9 illustrates an engineered

FIGURE 1.9 An engineered DNA strand between metal-atom contacts that could function as a molecular electronics device. (Courtesy of NASA Ames Center of Nanotechnology, http://ipt.arc.nasa.gov.)

DNA strand between metallic atoms, noting that the width of a DNA strand is around 2 nm. Such a structure could function as a sensor or as another electronic component. Molecular electronics, while in its infancy, is expected to revolutionize the electronics industry and to enable continuous technological progress through the twenty-first century.

Nano/microscale research and discoveries have been instrumental to the development of technologies used today in microelectronics, photonics, communication, manufacture, and biomedicine. However, systematic and large-scale government investment toward nanoscience and engineering did not start until the late 1990s, when the Interagency Working Group on Nanoscience, Engineering, and Technology (IWGN) was formed under the National Science and Technology Council (NSTC). The first report was released in fall 1999, entitled “Nanostructure Science and Technology,” followed by the report, “Nanotechnology Research Directions.” In July 2000, NSTC published the “National Nanotechnology Initiative (NNI).” A large number of nanotechnology centers and nanofabrication facilities have been established since then; see www.nano.gov. In the United States, the government spending on nanotechnology R&D exceeded $1 billion in 2005, as compared to $464 million in 2001 and approximately $116 million in 1997. The total government investment worldwide was over $4 billion in 2005, and Japan and European countries each invested an amount similar to that of the United States. Recognizing the increasing impact on engineering and science, the American Society of Mechanical Engineers established the ASME Nanotechnology Institute in mid-2001 and sponsored a large number of international conferences and workshops; see http://nano.asme.org/. Understanding the thermal transport and properties at the nanoscale is extremely important as mentioned earlier.
Engineers have the responsibility to transfer the basic science findings into technological advances, to design and develop better materials with desired functions, to build systems that integrate from small to large scales, to perform realistic modeling and simulation that facilitate practical realization of improved performance and continuously reduced cost, and to conduct quantitative measurements and tests that determine the materials properties and system performance. Like any other technology, nanotechnology may also have some adverse effects, such as toxic products and biochemical hazards, which are harmful to human health and the environment. There are also issues and debates concerning security, ethics, and religion. Government and industry standard organizations, as well as universities, have paid great attention to the societal implications and education issues in recent years. Optimists believe that we can harness nanobiotechnology to improve the quality of human life and benefit social progress, while overcoming the adverse effects, as we have done with electricity, chemical plants, and space technology.

1.4 OBJECTIVES AND ORGANIZATION OF THIS BOOK

Scientists, engineers, entrepreneurs, and lawmakers must work together for the research outcomes to be transferred into practical products that will advance the technology and benefit society. Nanotechnology is still in the early stage and holds tremendous potential; therefore, it is important to educate a large number of engineers with a solid background in nanoscale analysis and design so that they will become tomorrow’s leaders and inventors. There is a growing demand for educating mechanical engineering students at both the graduate and undergraduate levels with a background in thermal transport at micro/nanoscales. Micro/nanoscale heat transfer courses have been introduced in a number of universities; however, most of these courses are limited to the graduate level. While an edited book on Microscale Energy Transport has been available since 1998,3 it is difficult to use as a textbook due to the lack of examples, homework problems, and sufficient details on each subject. Some universities have introduced nanotechnology-related courses to freshmen and sophomores, with no in-depth coverage of the fundamentals of physics. A large number of institutions have introduced joint mechanical-electrical engineering courses on MEMS/NEMS, with a focus on device-level manufacturing and processing technology. To understand the thermal transport phenomena and thermophysical properties at small length

scales, the concepts of quantum mechanics, solid state physics, and electrodynamics are indispensable. These concepts, however, are difficult for engineering students to comprehend. The aim of this book is to introduce the much needed physics knowledge without overwhelming mathematical operators or notions that are unfamiliar to engineering students. Therefore, this book can be used as the textbook not only in a graduate-level course but also in a tech elective for senior engineering undergraduates. While the book contains numerous equations, the math requirement mostly does not exceed engineering calculus including series, differential and integral equations, and some vector and matrix algebra. The reason to include such a large number of equations is to provide necessary derivation steps, so that readers can follow and understand clearly. This is particularly helpful for practicing engineers who do not have a large number of references at hand. The emphasis of this book is placed on the fundamental understanding of the phenomena and properties: that is, why do we need particular equations and how can we apply them to solve thermal transport problems at the prescribed length and time scales? Selected and refined examples are provided that are both practical and illustrative. At the end of each of the remaining nine chapters, a large number of exercises are given at various levels of complexity and difficulty. Numerical methods are not presented in this book. Most of the problems can be solved with a personal computer using a typical software program or spreadsheet. For course instructors, the solutions of many homework problems can be obtained from the author. The field of micro/nanoscale heat transfer was cultivated and fostered by Professor Chang-Lin Tien beginning in the late 1980s, along with the rapid development in microelectronics, MEMS, and nanotechnology.
His long-lasting and legendary contributions to the thermal science research have been summarized in the recent volume of Annual Review of Heat Transfer.50 As early as in the 1960s, Professor Tien investigated the fundamentals of the radiative properties of gas molecules, the size effect on the thermal conductivity of thin films and wires, and radiation tunneling between closely spaced surfaces. He published (with John H. Lienhard) a book in 1971, titled Statistical Thermodynamics, which provided inspiring discussions on early quantum mechanics and models of thermal properties of gases, liquids, and crystalline solids. While thermodynamics is a required course for mechanical engineering students, the principles of thermodynamics cannot be understood without a detailed background in statistical thermodynamics. Statistical mechanics and kinetic theory are also critical for understanding thermal properties and transport phenomena. Chapter 2 provides an overview of equilibrium thermodynamics, heat transfer, and fluid mechanics. Built up from the undergraduate mechanical engineering curricula, the materials are introduced in a quite different sequence to emphasize thermal equilibrium, the second law of thermodynamics, and thermodynamic relations. The concept of entropy is rigorously defined and applied to analyze conduction and convection heat transfer problems in this chapter. It should be noted that, in Chap. 8, an extensive discussion is given on the entropy of radiation. Chapter 3 introduces statistical mechanics and derives the classical (Maxwell-Boltzmann) statistics and quantum (i.e., Bose-Einstein and Fermi-Dirac) statistics. The first, second, and third laws of thermodynamics are presented with a microscopic interpretation, leading to the discussion of Bose-Einstein condensate and laser cooling of atoms. The classical statistics are extensively used to obtain the ideal gas equation, the velocity distribution, and the specific heat. 
A concise presentation of elementary quantum mechanics is then provided. This will help students gain a deep understanding of the earlier parts of this chapter, for example, the quantization of energy levels and the energy storage mechanisms by translation, rotation, and vibration for modeling the specific heat of ideal polyatomic gases. The combined knowledge of quantum mechanics and statistical thermodynamics is important for subsequent studies. The concept of the photon as an elementary particle and how it interacts with an atom are discussed according to Einstein’s 1917 paper on the atomic absorption and emission mechanisms. Finally, the special theory of relativity is briefly introduced to help understand the limitation of mass conservation and the generality of the law of energy conservation.


Chapter 4 begins with a very basic kinetic theory of dilute gases and provides a microscopic understanding of pressure and shear. With the help of mean free path and average collision distance, the transport coefficients such as viscosity, thermal conductivity, and mass diffusion coefficient are described. Following a discussion of intermolecular forces, the detailed Boltzmann transport equation (BTE) is presented to fully describe hydrodynamic equations as well as Fourier’s law of heat conduction, under appropriate approximations. In the next section, the regimes of microflow are described based on the Knudsen number, and the current methods to deal with microfluidics are summarized. The heat transfer associated with slip flow and temperature jump is presented in more detail with a simple planar geometry. Then, gas conduction between two surfaces under free molecular flow is derived. These examples, while simple, capture some of the basics of microfluidics. No further discussion is given on properties of liquids or multiphase fluids. It should be noted that several books on microflow already exist in the literature. The next three chapters provide a comprehensive treatment of nano/microscale heat transfer in solids, with an emphasis on the physical phenomena as well as material properties. The materials covered in Chap. 5 are based on simple free-electron model, kinetic theory, and BTE without a detailed background of solid state physics, which is discussed afterward in Chap. 6. This not only helps students comprehend the basic, underlying physical mechanisms but also allows the instructor to integrate Chap. 5 into a graduate heat conduction course. For an undergraduate elective, Chap. 6 can be considered as reading material or reference without spending too much time going through the details in class. In Chap. 5, the theory of specific heat is presented with a detailed treatment on the quantum size effect. 
Similarly, the theory of thermal conductivity of metals and dielectric solids is introduced. Because of the direct relation between electrical and thermal conductivities and the importance of thermoelectric effects, irreversible thermodynamics and thermoelectricity are also introduced. The classical size effect on thermal conductivity due to boundary scattering is elaborated. Finally, the concept of quantum conductance (both electric and thermal) is introduced. Chapter 6 introduces the electronic band structures and phonon dispersion relations in solids. It helps understand semiconductor physics and some of the difficulties of free-electron model for metals. Photoemission, thermionic emission, and electron tunneling phenomena are introduced. The electrical transport in semiconductors is described with applications in energy conversion and optoelectronic devices. Chapter 7 focuses on nonequilibrium energy transport in nanostructures, including non-Fourier equations for transient heat conduction. The equation of phonon radiative transfer is presented and solved for thin-film and multilayer structures. The phenomenon of thermal boundary resistance is studied microscopically. A regime map is developed in terms of the length scale and the time scale from macroscale to microscale to nanoscale heat conduction. Additional reading materials regarding multiscale modeling, atomistic modeling, and thermal metrology are provided as references. The last three chapters give comprehensive discussion on nano/microscale radiation with extensive background on the fundamentals of electromagnetic waves, the optical and thermal radiative properties of materials and surfaces, and the recent advancement in nanophotonics and nanoscale radiative transfer. Chapter 8 presents the Maxwell equations of electromagnetic waves and the derivation of Planck’s law and radiation entropy. 
The electric and magnetic properties of the newly developed class of materials, i.e., negative-refractive-index materials, are also discussed. More extensive discussion of the radiative properties of thin films, gratings, and rough surfaces is given in Chap. 9. The wave interference, partial coherence, and diffraction phenomena are introduced with detailed formulations. In Chap. 10, attention is given to the evanescent wave, coupling and localization, surface plasmon polaritons, surface phonon polaritons, and near-field energy transfer. This chapter contains the most recent developments in near-field optics, nanophotonics, and nanoscale radiative transfer. These advancements will continue to impact energy conversion devices, sensors, and nanoscale photothermal manufacturing.


It is noteworthy that the book Nanoscale Energy Transport and Conversion, by Professor G. Chen, has recently been published.4 In his book, a parallel treatment is presented to deal with electron, molecule, phonon, and photon transport processes. Such a parallel treatment places emphasis on the similarity and analogy between different energy carriers and transport mechanisms. While the approaches are unique and interesting, the book is difficult to use as an entry-level textbook without some preliminary background in solid state physics and statistical thermodynamics. The present book places materials within the context of each topic by presenting statistical thermodynamics, kinetic theory of ideal gases and microfluidics, electrons and phonons in solids, and electromagnetic waves and their interactions with nanomaterials in separate chapters. In addition to the differences in the organization and presentation, the coverage of the present text differs to some extent from Chen’s book. The present book contains much more extensive discussion on statistical thermodynamics and nanoscale thermal radiation, while Chen’s book includes additional chapters on liquids and their interfaces as well as molecular dynamics simulation. As a result, the two books complement each other in terms of the coverage and organization. It is hoped that the present text can be used either as a whole in a one-semester course, or in part for integration into an existing thermal science course for several weeks on a particular topic. Examples are graduate-level thermodynamics (Chaps. 2 and 3), convection heat transfer (Chap. 4), conduction heat transfer (Chaps. 5 and 7), and radiation heat transfer (Chaps. 8 and 9). Selected materials may also be used to introduce nanoscale thermal sciences in undergraduate heat transfer and fluid mechanics courses.
Some universities offer a second course on thermodynamics at the undergraduate level, in which statistical thermodynamics and quantum theory can also be introduced. This text is also suitable for self-study by researchers or practicing engineers who graduated from a traditional engineering discipline. Considerable effort has been made to balance depth with breadth so that the text is easy to understand and contains sufficient coverage of both the fundamentals and advanced developments in the field. Readers will gain the background necessary to understand the contemporary research in nano/microscale thermal engineering and to solve a variety of practical problems using the approaches presented in the text.

REFERENCES

1. C. P. Poole, Jr. and F. J. Owens, Introduction to Nanotechnology, Wiley, New York, 2003.
2. E. L. Wolf, Nanophysics and Nanotechnology: An Introduction to Modern Concepts in Nanoscience, Wiley-VCH, Weinheim, Germany, 2004.
3. C. L. Tien, A. Majumdar, and F. M. Gerner (eds.), Microscale Energy Transport, Taylor & Francis, Washington, DC, 1998.
4. G. Chen, Nanoscale Energy Transport and Conversion, Oxford University Press, New York, 2005.
5. C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J., 27, 379–423, 623–656, July & October 1948. See http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html.
6. G. E. Moore, “Cramming more components onto integrated circuits,” Electronics, 38(8), 114–117, April 1965; G. E. Moore, “Progress in digital integrated electronics,” IEEE Tech. Digest (International Electron Devices Meeting), 11–13, 1975. www.intel.com/technology/mooreslaw.
7. B. K. Tsai, “A summary of lightpipe radiation thermometry research at NIST,” J. Res. Natl. Inst. Stand. Technol., 111, 9–30, 2006.
8. SIA, International Technology Roadmap for Semiconductors. www.itrs.net.
9. C. H. Fan and J. P. Longtin, “Radiative energy transport at the spatial and temporal micro/nano scales,” in Heat Transfer and Fluid Flow in Microscale and Nanoscale Structures, M. Faghri and B. Sunden (eds.), WIT Press, Southampton, UK, pp. 225–275, 2003.
10. W. Denk, J. H. Strickler, and W. W. Webb, “Two-photon laser scanning fluorescence microscopy,” Science, 248, 73–76, 1990.
11. T. Yu, C. K. Ober, S. M. Kuebler, W. Zhou, S. R. Marder, and J. W. Perry, “Chemically-amplified positive resist system for two-photon three-dimensional lithography,” Adv. Mater., 15, 517–521, 2003; S. M. Kuebler, K. L. Braun, W. Zhou, et al., “Design and application of high-sensitivity two-photon initiators for three-dimensional microfabrication,” J. Photochem. Photobio. A: Chemistry, 158, 163–170, 2003.
12. M. F. Modest and H. Abakians, “Heat-conduction in a moving semi-infinite solid subject to pulsed laser irradiation,” J. Heat Transfer, 108, 597–601, 1986; M. F. Modest and H. Abakians, “Evaporative cutting of a semi-infinite body with a moving cw laser,” J. Heat Transfer, 108, 602–607, 1986.
13. C. L. Tien, T. Q. Qiu, and P. M. Norris, “Microscale thermal phenomena in contemporary technology,” Thermal Sci. Eng., 2, 1–11, 1994.
14. R. J. Stoner and H. J. Maris, “Kapitza conductance and heat flow between solids at temperatures from 50 to 300 K,” Phys. Rev. B, 48, 16373–16387, 1993.
15. W. S. Capinski, H. J. Maris, T. Ruf, M. Cardona, K. Ploog, and D. S. Katzer, “Thermal-conductivity measurements of GaAs/AlAs superlattices using a picosecond optical pump-and-probe technique,” Phys. Rev. B, 59, 8105–8113, 1999.
16. P. M. Norris, A. P. Caffrey, R. Stevens, J. M. Klopf, J. T. McLeskey, and A. N. Smith, “Femtosecond pump-probe nondestructive evaluation of materials,” Rev. Sci. Instrum., 74, 400–406, 2003; R. J. Stevens, A. N. Smith, and P. M. Norris, “Measurement of thermal boundary conductance of a series of metal-dielectric interfaces by the transient thermoreflectance techniques,” J. Heat Transfer, 127, 315–322, 2005.
17. O. Manasreh, Semiconductor Heterojunctions and Nanostructures, McGraw-Hill, New York, 2005.
18. G. Chen, “Heat transfer in micro- and nanoscale photonic devices,” Annu. Rev. Heat Transfer, 7, 1–18, 1996.
19. Y. Jaluria, “Thermal processing of materials: from basic research to engineering,” J. Heat Transfer, 125, 957–979, 2003; X. Cheng and Y. Jaluria, “Optimization of a thermal manufacturing process: drawing of optical fiber,” Int. J. Heat Mass Transfer, 48, 3560–3573, 2005; C. Chen and Y. Jaluria, “Modeling of radiation heat transfer in the drawing of an optical fiber with multilayer structure,” J. Heat Transfer, 129, 342–352, 2007.
20. P. N. Prasad, Nanophotonics, Wiley, New York, 2004.
21. Z. M. Zhang and M. P. Mengüç, “Guest editorial: special issue on nano/microscale radiative transfer,” J. Heat Transfer, 129, 1–2, 2007.
22. R. P. Feynman, “There’s plenty of room at the bottom,” J. Microelectromechanical Systems, 1, 60–66, 1992; R. P. Feynman, “Infinitesimal Machinery,” J. Microelectromechanical Systems, 2, 4–14, 1993. www.zyvex.com/nanotech/feynman.html.
23. M. J. Madou, Fundamentals of Microfabrication: The Science of Miniaturization, 2nd ed., CRC Press, Boca Raton, FL, 2002.
24. E. O. Sunden, T. L. Wright, J. Lee, W. P. King, and S. Graham, “Room-temperature chemical vapor deposition and mass detection on a heated atomic force microscope cantilever,” Appl. Phys. Lett., 88, 033107, 2006.
25. K. Hirahara, K. Suenaga, S. Bandow, et al., “One-dimensional metallofullerene crystal generated inside single-walled carbon nanotubes,” Phys. Rev. Lett., 85, 5384, 2000. Also see Phys. Rev. Focus, 19 December 2000 at http://focus.aps.org/story/v6/st27.
26. P. X. Gao, Y. Ding, W. J. Mai, W. L. Hughes, C. S. Lao, and Z. L. Wang, “Conversion of zinc oxide nanobelt into superlattice-structured nanohelices,” Science, 309, 1700–1704, 2005; X. Y. Kong, Y. Ding, R. Yang, and Z. L. Wang, “Single-crystal nanorings formed by epitaxial self-coiling of polar nanobelts,” Science, 309, 1348–1351, 2004.
27. Y. Yang, W. Liu, and M. Asheghi, “Thermal and electrical characterization of Cu/CoFe superlattices,” Appl. Phys. Lett., 84, 3121–3123, 2004; Y. Yang, R. M. White, and M. Asheghi, “Thermal characterization of Cu/CoFe multilayer for giant magnetoresistive (GMR) head applications,” J. Heat Transfer, 128, 113–120, 2006.
28. G. Binnig and H. Rohrer, “Scanning tunneling microscopy,” Helv. Phys. Acta, 55, 726–735, 1982; G. Binnig, H. Rohrer, Ch. Gerber, and E. Weibel, “Surface studies by scanning tunneling microscopy,” Phys. Rev. Lett., 49, 57–61, 1982; G. Binnig, H. Rohrer, Ch. Gerber, and E. Weibel, “7 × 7 reconstruction on Si(111) resolved in real space,” Phys. Rev. Lett., 50, 120–123, 1983.
29. M. F. Crommie, C. P. Lutz, and D. M. Eigler, “Confinement of electrons to quantum corrals on a metal surface,” Science, 262, 218–220, 1993.
30. G. Binnig, C. F. Quate, and Ch. Gerber, “Atomic force microscope,” Phys. Rev. Lett., 56, 930–933, 1986.
31. C. C. Williams and H. K. Wickramasinghe, “Scanning thermal profiler,” Appl. Phys. Lett., 49, 1587–1589, 1986; J. M. R. Weaver, L. M. Walpita, and H. K. Wickramasinghe, “Optical absorption microscopy with nanometer resolution,” Nature, 342, 783–785, 1989; M. Nonnenmacher and H. K. Wickramasinghe, “Optical absorption spectroscopy by scanning force microscopy,” Ultramicroscopy, 42–44, 351–354, 1992.
32. A. Majumdar, “Scanning thermal microscopy,” Annu. Rev. Mater. Sci., 29, 505–585, 1999.
33. H.-K. Lyeo, A. A. Khajetoorians, L. Shi, et al., “Profiling the thermoelectric power of semiconductor junctions with nanometer resolution,” Science, 303, 818–820, 2004; Z. Bian, A. Shakouri, L. Shi, H.-K. Lyeo, and C. K. Shih, “Three-dimensional modeling of nanoscale Seebeck measurement by scanning thermoelectric microscopy,” Appl. Phys. Lett., 87, 053115, 2005.
34. H. J. Mamin and D. Rugar, “Thermomechanical writing with an atomic force microscope tip,” Appl. Phys. Lett., 61, 1003–1005, 1992; H. J. Mamin, “Thermal writing using a heated atomic force microscope tip,” Appl. Phys. Lett., 69, 433–435, 1996.
35. G. Binnig, M. Despont, U. Drechsler, et al., “Ultrahigh-density atomic force microscopy data storage with erase capability,” Appl. Phys. Lett., 74, 1329–1331, 1999; W. P. King, T. W. Kenny, K. E. Goodson, et al., “Atomic force microscope cantilevers for combined thermomechanical data writing and reading,” Appl. Phys. Lett., 78, 1300–1302, 2001.
36. U. Dürig, G. Cross, M. Despont, et al., “‘Millipede’ – an AFM data storage system at the frontier of nanotechnology,” Tribology Lett., 9, 25–32, 2000; P. Vettiger, G. Cross, M. Despont, et al., “The ‘millipede’ – nanotechnology entering data storage,” IEEE Trans. Nanotechnol., 1, 39–55, 2002.
37. P. E. Sheehan, L. J. Whitman, W. P. King, and B. A. Nelson, “Nanoscale deposition of solid inks via thermal dip pen nanolithography,” Appl. Phys. Lett., 85, 1589–1591, 2004.
38. J. A. Eastman, S. R. Phillpot, S. U. S. Choi, and P. Keblinski, “Thermal transport in nanofluids,” Annu. Rev. Mater. Res., 34, 219–246, 2004.
39. R. S. Prasher, P. Bhattacharya, and P. E. Phelan, “Thermal conductivity of nanoscale colloidal solutions (nanofluids),” Phys. Rev. Lett., 94, 025901, 2005; R. Prasher, P. Bhattacharya, and P. E. Phelan, “Brownian-motion-based convective-conductive model for the effective thermal conductivity of nanofluids,” J. Heat Transfer, 128, 588–595, 2006.
40. G. Chen and A. Shakouri, “Heat transfer in nanostructures for solid-state energy conversion,” J. Heat Transfer, 124, 242–252, 2002; H. Böttner, G. Chen, and R. Venkatasubramanian, “Aspects of thin-film superlattice thermoelectric materials, devices and applications,” MRS Bulletin, 31, 211–217, March 2006.
41. S. Basu, Y.-B. Chen, and Z. M. Zhang, “Microscale radiation in thermophotovoltaic devices – A review,” Int. J. Ener. Res., 31, in press, 2007. (Published online 6 Dec. 2006.)
42. M. Law, L. E. Greene, J. C. Johnson, R. Saykally, and P. Yang, “Nanowire dye-sensitized solar cells,” Nature Mater., 4, 455–459, 2005.
43. A. Mihi and H. Miguez, “Origin of light-harvesting enhancement in colloidal-photonic-crystal-based dye-sensitized solar cells,” J. Phys. Chem. B, 109, 15968–15976, 2005.
44. G. Crabtree, M. Dresselhaus, and M. Buchanan, “The hydrogen economy,” Physics Today, 39–44, December 2004.
45. A. Lewis, H. Taha, A. Strinkovski, et al., “Near-field optics: from subwavelength illumination to nanometric shadowing,” Nature Biotechnol., 21, 1378–1386, 2003.
46. X. Michalet, F. F. Pinaud, L. A. Bentolila, et al., “Quantum dots for live cells, in vivo imaging, and diagnostics,” Science, 307, 538–544, 2005.
47. I. L. Medintz, H. T. Uyeda, E. R. Goldman, and H. Mattoussi, “Quantum dot bioconjugates for imaging, labelling and sensing,” Nature Mater., 4, 435–446, 2005.
48. B. Yu and M. Meyyappan, “Nanotechnology: role in emerging nanoelectronics,” Solid-State Electronics, 50, 536–544, 2006.
49. S. De Franceschi and L. Kouwenhoven, “Electronics and the single atom,” Nature, 417, 701–702, 2002.
50. V. Prasad, Y. Jaluria, and G. Chen (eds.), Annual Review of Heat Transfer, Vol. 14, Begell House, New York, 2005.

CHAPTER 2

OVERVIEW OF MACROSCOPIC THERMAL SCIENCES

This chapter provides a concise description of the basic concepts and theories underlying classical thermodynamics and heat transfer. Different approaches exist in presenting the subject of thermodynamics. Most engineering textbooks first introduce temperature, then discuss energy, work, and heat, and define entropy afterward. Callen developed an axiomatic structure using a simple set of abstract postulates to combine the physical information that is included in the laws of thermodynamics.1 Continuing the effort pioneered by Keenan and Hatsopoulos,2 Gyftopoulos and Beretta3 developed a logical sequence to introduce the basic concepts with a rigorous definition of each thermodynamic term. Their book has been a great inspiration to the present author in comprehending and teaching thermodynamics. Here, an overview of classical thermodynamics is provided that is somewhat beyond typical undergraduate textbooks.4,5 Details on the historic development of classical thermodynamics can be found from Bejan6 and Kestin7, and references therein. The basic phenomena and governing equations in energy, mass, and momentum transfer will be presented subsequently in a self-consistent manner without invoking microscopic theories.

2.1 FUNDAMENTALS OF THERMODYNAMICS

A system is a collection of constituents (whose amounts may be fixed or varied within a specified range) in a defined space (e.g., a container whose volume may be fixed or varied within a specified range), subject to other external forces (such as gravitational and magnetic forces) and constraints. External forces are characterized by parameters. An example is the volume of a container, which is a parameter associated with the forces that confine the constituents within a specified space. Everything that is not included in the system is called the environment or surroundings of the system. Quantities that characterize the behavior of a system at any instant of time are called properties of the system. Properties must be measurable and their values are independent of the measuring devices. Properties supplement constituents and parameters to fully characterize a system. At any given time, the system is said to be in a state, which is fully characterized by the types and amount of constituents, a set of parameters associated with various types of external forces, and a set of properties. Two states are identical if the amount of each type of constituents and values of all the parameters and properties are the same.

A system may experience a spontaneous change of state, when the change of state does not involve any interaction between the system and its environment. If the system changes its state through interactions with other systems in the environment, it is said to experience an induced change of state. If a system can experience only spontaneous changes of state, it is said to be an isolated system, that is, the change of state of the system does not affect the environment of the system. The study of the possible and allowed states of a system is called kinematics, and the study of the time evolution of the state is called dynamics. The relation that describes the change of state of a system as a function of time is the equation of motion. In practice, the complete equations of motion are often not known. Therefore, in thermodynamics the description of the change of state is usually given in terms of the end states (i.e., the initial and final states) and the modes of interaction (for example, work and heat, which are discussed later). The end states and the modes of interaction specify a process.

A spontaneous change of state is also called a spontaneous process. A process is reversible if there is at least one way to restore both the system and its environment to their initial states. Otherwise, the process is irreversible, i.e., it is not possible to restore both the system and its environment to their initial states. A steady state is one that does not change as a function of time despite interactions between the system and other systems in the environment.

2.1.1 The First Law of Thermodynamics

Energy is a property of every system in any state. The first law of thermodynamics states that energy can be transferred to or from a system but can be neither created nor destroyed. The energy balance for a system can be expressed as

$$\Delta E = E_2 - E_1 = E_{\mathrm{net,in}} \qquad (2.1a)$$

where $\Delta$ denotes a finite change, subscripts 1 and 2 refer to the initial and final states, respectively, and $E_{\mathrm{net,in}} = E_{\mathrm{in}} - E_{\mathrm{out}}$ is the net amount of energy transferred into the system. For an infinitesimal change, the differential form of the energy balance is

$$dE = \delta E_{\mathrm{net,in}} \qquad (2.1b)$$

Here, $d$ is used to signify a differential change of the property of a system, and $\delta$ is used to specify a differentially small quantity that is not a property of any system. Clearly, the energy of an isolated system is conserved. Energy is an additive property, i.e., the energy of a composite system is the sum of the energies of all individual subsystems. Examples are kinetic energy and potential energy, as defined in classical mechanics, and internal energy, which will be discussed later. A similar expression for mass balance can also be written.

The term mechanical effect is used for the kind of processes described in mechanics, such as the change of the height of a weight in a gravitational field, the change of the relative positions of two charged particles, the change of the velocity of a point mass, the change of the length of a spring, or a combination of such changes. All mechanical effects are equivalent in the sense that it is always possible to arrange forces and processes that annul all the mechanical effects except one that we choose. It is common to choose the rise and fall of a weight in a gravity field to represent this kind of process.

A cyclic process (also called a cycle) is one with identical initial and final states. A perpetual-motion machine of the first kind (PMM1) is any device (or system) undergoing a cyclic process that produces no external effects but the rise or fall of a weight in a gravity field. A PMM1 violates the first law of thermodynamics, and hence, it is impossible to build a PMM1. Perpetual motion, however, may exist as long as it produces zero net external effect. Examples of perpetual motion are a lossless oscillating pendulum, an electric current through a superconducting coil, and so forth.


2.1.2 Thermodynamic Equilibrium and the Second Law

An equilibrium state is a state that cannot change spontaneously with time. There are different types of equilibrium: unstable, stable, and metastable. A stable-equilibrium state is a state that cannot be altered to a different state without leaving any net effect on the environment. In the following, a stable-equilibrium state is frequently referred to as a state at thermodynamic equilibrium. The stable-equilibrium-state principle, or state principle, can be phrased as follows: Among all states of a system with a given set of values of energy, parameters, and constituents, there exists one and only one stable-equilibrium state. That is to say that, in a stable-equilibrium state, all properties are uniquely determined by the amount of energy, the value of each parameter, and the amount of each type of constituents. This principle is an integral part of the second law of thermodynamics.2,3,7 It is important for the thermodynamic definition of temperature and the derivation of thermodynamic relations in stable-equilibrium states. Another aspect of the second law of thermodynamics is the definition of an important property, called entropy, as discussed next.

Entropy is an additive property of every system in any state. The second law of thermodynamics asserts that, in an isolated system, entropy cannot be destroyed but can either be created (in an irreversible process) or remain the same (in a reversible process). The entropy produced as time evolves during an irreversible process is called the entropy generation ($S_{\mathrm{gen}}$) of the process due to irreversibility. Like energy, entropy can be transferred from one system to another. One can write the entropy balance as follows (keeping in mind that entropy generation must not be negative):

$$\Delta S = S_2 - S_1 = S_{\mathrm{net,in}} + S_{\mathrm{gen}} \quad \text{with} \quad S_{\mathrm{gen}} \ge 0 \qquad (2.2a)$$

or

$$dS = \delta S_{\mathrm{net,in}} + \delta S_{\mathrm{gen}} \quad \text{with} \quad \delta S_{\mathrm{gen}} \ge 0 \qquad (2.2b)$$

Here again, $\delta$ is used to indicate an infinitesimal quantity that is not a property of any system. For a system with fixed values of energy ($E$), parameters, and constituents, the entropy of the system is the largest in the stable-equilibrium state. This is the highest entropy principle. Applying this principle to an isolated system for which the energy is conserved, the entropy of the system will increase until a thermodynamic equilibrium is reached. Spontaneous changes of state are usually irreversible and accompanied by entropy generation.

The second law of thermodynamics can be summarized with the following three statements: (1) There exists a unique stable-equilibrium state for any system with given values of energy, parameters, and constituents. (2) Entropy is an additive property, and for an isolated system, the entropy change must be nonnegative. (3) Among all states with the same values of energy, parameters, and constituents, the entropy of the stable-equilibrium state is the maximum.

The energy of a system with volume ($V$) as its only parameter (neglecting other external forces) is called the internal energy ($U$). The state principle implies that there are $r + 2$ (where $r$ is the number of different constituents) independent variables that fully characterize a stable-equilibrium state of such a system. Therefore in a stable-equilibrium state, all properties are functions of $r + 2$ independent variables. Since entropy is a property of the system, we have

$$S = S(U, V, N_1, N_2, \ldots, N_r) \qquad (2.3)$$

where $N_i$ is the number of particles of the $i$th species (or type of constituents). This function is continuous and differentiable, and furthermore, it is a monotonically increasing function of energy for fixed values of $V$ and $N_j$’s.1,3,6 Equation (2.3) can be uniquely solved for $U$ so that

$$U = U(S, V, N_1, N_2, \ldots, N_r) \qquad (2.4)$$

which is also continuous and admits partial derivatives of all orders. Each first order partial derivative of Eq. (2.3) or (2.4) represents a property of the stable-equilibrium state. For example, temperature and pressure are properties of a system at thermodynamic equilibrium. The (absolute) temperature is defined by

$$T = \left(\frac{\partial U}{\partial S}\right)_{V,\,N_j\text{'s}} \qquad (2.5a)$$

and the pressure is defined by

$$P = -\left(\frac{\partial U}{\partial V}\right)_{S,\,N_j\text{'s}} \qquad (2.5b)$$

The partial derivative with respect to the $i$th type of constituents defines the chemical potential of that species,

$$\mu_i = \left(\frac{\partial U}{\partial N_i}\right)_{S,\,V,\,N_j\text{'s}\,(j \ne i)} \qquad (2.5c)$$

Equation (2.3) or (2.4) is called the fundamental relation for states at thermodynamic equilibrium. The differential form of Eq. (2.4) is the Gibbs relation:

$$dU = T\,dS - P\,dV + \sum_{i=1}^{r} \mu_i\,dN_i \qquad (2.6)$$

where Eq. (2.5) has been used. The above equation may be rearranged into the form

$$dS = \frac{1}{T}\,dU + \frac{P}{T}\,dV - \sum_{i=1}^{r} \frac{\mu_i}{T}\,dN_i \qquad (2.7)$$

Therefore,

$$\frac{1}{T} = \left(\frac{\partial S}{\partial U}\right)_{V,\,N_j\text{'s}}, \quad \frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_{U,\,N_j\text{'s}}, \quad \text{and} \quad \frac{\mu_i}{T} = -\left(\frac{\partial S}{\partial N_i}\right)_{U,\,V,\,N_j\text{'s}\,(j \ne i)} \qquad (2.8)$$
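As an illustration of Eqs. (2.5a) to (2.5c), the following minimal Python sketch differentiates a fundamental relation numerically. The functional form $U(S,V,N) = a N^{5/3} V^{-2/3} \exp[2S/(3Nk_B)]$ is an assumption introduced here (it is the standard form for a monatomic ideal gas, up to the constant $a$, and is not derived in this chapter). The reduced units with $k_B = a = 1$, the function names, and the sample state are likewise hypothetical.

```python
import math

K_B = 1.0  # Boltzmann constant in reduced units (assumption for illustration)
A = 1.0    # hypothetical constant in the assumed model U(S, V, N)


def internal_energy(S, V, N):
    """Assumed fundamental relation U(S, V, N) of a monatomic ideal gas."""
    return A * N ** (5.0 / 3.0) * V ** (-2.0 / 3.0) * math.exp(2.0 * S / (3.0 * N * K_B))


def central_difference(f, x, rel_step=1e-6):
    """Approximate df/dx with a symmetric finite difference."""
    h = rel_step * max(abs(x), 1.0)
    return (f(x + h) - f(x - h)) / (2.0 * h)


def temperature(S, V, N):
    """T = (dU/dS) at fixed V and N, Eq. (2.5a)."""
    return central_difference(lambda s: internal_energy(s, V, N), S)


def pressure(S, V, N):
    """P = -(dU/dV) at fixed S and N, Eq. (2.5b)."""
    return -central_difference(lambda v: internal_energy(S, v, N), V)


def chemical_potential(S, V, N):
    """mu = (dU/dN) at fixed S and V, Eq. (2.5c), single species."""
    return central_difference(lambda n: internal_energy(S, V, n), N)


# Arbitrary sample state in reduced units (hypothetical values).
S, V, N = 2.0, 3.0, 1.5
U = internal_energy(S, V, N)
T = temperature(S, V, N)
P = pressure(S, V, N)
mu = chemical_potential(S, V, N)

print(f"T = {T:.6f}, P = {P:.6f}, mu = {mu:.6f}")
print(f"P*V = {P * V:.6f} versus N*k_B*T = {N * K_B * T:.6f}")
print(f"U = {U:.6f} versus 1.5*N*k_B*T = {1.5 * N * K_B * T:.6f}")
```

For this assumed relation, the derivatives should reproduce the ideal gas behavior $PV = N k_B T$ and $U = \tfrac{3}{2} N k_B T$ to within the finite-difference error, which the two printed comparisons make visible.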

An interaction between two systems that results in a transfer of energy without net exchanges of entropy and constituents is called a work interaction. The amount of energy transferred in such an interaction is called work (W). An interaction that has only mechanical effects is a work interaction, but a work interaction may involve nonmechanical effects. A process that involves only work interaction is called an adiabatic process. Another kind of a typical interaction is heat interaction, in which both energy and entropy are transferred without net exchanges of constituents and parameters between two systems. The amount of energy transferred in a heat interaction is called heat (Q). Furthermore, the amount of entropy transferred dS is equal to the amount of energy transferred dQ divided by the temperature TQ at which the heat interaction happens, i.e., dS dQ/TQ. If a system cannot exchange constituents with other systems, it is said to be a closed system; otherwise, it is an open system. Reversible processes are considered as the limiting cases of real processes, which are always accompanied by a certain amount of irreversibility. Such an ideal process is called

OVERVIEW OF MACROSCOPIC THERMAL SCIENCES

29

a quasi-equilibrium (or quasi-static) process, in which each stage can be made as close to thermodynamic equilibrium as possible if the movement is frictionless and very slow. In an ideal process, a finite amount of heat can be transferred reversibly from one system to another at a constant temperature. In practice, heat transfer can only happen when there is a temperature difference, and the process is always irreversible. A perpetual-motion machine of the second kind (PMM2) is a cyclic device that interacts with a system at thermodynamic equilibrium and produces no external effect other than the rise of a weight in a gravity field, without changing the values of parameters and the amounts of constituents of the system. Historically, there exist different statements of the second law of thermodynamics: The Kelvin-Planck statement of the second law is that it is impossible to build a PMM2. The Clausius statement of the second law is that it is not possible to construct a cyclic machine that will produce no effect other than the transfer of heat from a system at lower temperature to a system at higher temperature. These statements can be proved using the three statements of the second law of thermodynamics given earlier in this chapter. Example 2-1. Criteria for thermodynamic equilibrium. Consider a moveable piston (adiabatic and impermeable to matter) that separates a cylinder into two compartments (systems A and B), as shown in Fig. 2.1. We learned from mechanics that a mechanical equilibrium requires a balance of

FIGURE 2.1 Illustration of two systems (A and B) that may exchange work, heat, and species.

forces on both sides of the piston, that is to say the pressure of system A must be the same as that of system B (i.e., $P_A = P_B$). If the piston wall is made of materials that are diathermal (allowing heat transfer) and permeable to all species, under what conditions will the composite system C consisting of systems A and B be at stable equilibrium?

Solution. Assume system C is isolated from other systems, and each of the subsystems A and B is at a thermodynamic equilibrium state, whose properties are solely determined by its internal energy, volume, and amounts of constituents: $U_A$, $V_A$, $N_{j,A}$'s and $U_B$, $V_B$, $N_{j,B}$'s, respectively. There exist neighboring states for both subsystems with small differences in $U$, $V$, and $N_j$'s, but the values of the composite system must be conserved, i.e., $dU_A = -dU_B$, $dV_A = -dV_B$, and $dN_{i,A} = -dN_{i,B}$ ($i = 1, 2, \ldots, r$). The differential entropy of system C can be expressed as:

$$
\begin{aligned}
dS_C = dS_A + dS_B &= \frac{1}{T_A}dU_A + \frac{P_A}{T_A}dV_A - \sum_{i=1}^{r}\frac{\mu_{i,A}}{T_A}dN_{i,A} + \frac{1}{T_B}dU_B + \frac{P_B}{T_B}dV_B - \sum_{i=1}^{r}\frac{\mu_{i,B}}{T_B}dN_{i,B} \\
&= \left(\frac{1}{T_A} - \frac{1}{T_B}\right)dU_A + \left(\frac{P_A}{T_A} - \frac{P_B}{T_B}\right)dV_A - \sum_{i=1}^{r}\left(\frac{\mu_{i,A}}{T_A} - \frac{\mu_{i,B}}{T_B}\right)dN_{i,A}
\end{aligned}
\tag{2.9}
$$


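If system C is in a stable-equilibrium state, its entropy is maximum and $dS_C = 0$. Since the values of $dU_A$, $dV_A$, and $dN_{i,A}$ are arbitrary, we must have

$$\frac{1}{T_A} = \frac{1}{T_B}, \quad \frac{P_A}{T_A} = \frac{P_B}{T_B}, \quad \text{and} \quad \frac{\mu_{i,A}}{T_A} = \frac{\mu_{i,B}}{T_B} \quad (i = 1, 2, \ldots, r)$$

or

$$T_A = T_B, \quad P_A = P_B, \quad \text{and} \quad \mu_{i,A} = \mu_{i,B} \quad (i = 1, 2, \ldots, r) \tag{2.10}$$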

These conditions correspond to thermal equilibrium, mechanical equilibrium, and chemical equilibrium, respectively. The combination forms the criteria for thermodynamic equilibrium.

Discussion. In the case when the piston is diathermal but rigid and impermeable to matter, the entropy change of system C must be nonnegative, i.e.,

$$dS_C = dS_A + dS_B = \left(\frac{1}{T_A} - \frac{1}{T_B}\right)dU_A \ge 0 \tag{2.11}$$

The above expression implies that $dU_A \ge 0$ for $T_A < T_B$, and $dU_A \le 0$ for $T_A > T_B$. Spontaneous heat transfer can occur only from regions of higher temperature to regions of lower temperature. This essentially proves the Clausius statement of the second law of thermodynamics.
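The sign argument in the Discussion of Example 2-1 can be checked numerically. The following is a minimal sketch, assuming illustrative temperatures of 300 K and 400 K and an energy increment of 1 J (none of these values come from the text); the function name is hypothetical.

```python
def entropy_change(t_a, t_b, du_a):
    """Entropy change of the composite system, dS_C = (1/T_A - 1/T_B) * dU_A.

    t_a, t_b: temperatures of subsystems A and B in kelvin.
    du_a: small energy gain of A in joules; B loses the same amount.
    """
    return (1.0 / t_a - 1.0 / t_b) * du_a


# Illustrative values (assumed, not taken from the example).
T_A, T_B = 300.0, 400.0

ds_hot_to_cold = entropy_change(T_A, T_B, du_a=+1.0)  # energy flows from B (hot) to A (cold)
ds_cold_to_hot = entropy_change(T_A, T_B, du_a=-1.0)  # energy flows from A (cold) to B (hot)

print(f"hot -> cold: dS_C = {ds_hot_to_cold:+.6f} J/K")
print(f"cold -> hot: dS_C = {ds_cold_to_hot:+.6f} J/K")
```

Only the first case gives a nonnegative entropy change, so only energy transfer from the hotter subsystem to the colder one is allowed for the isolated composite system.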

The concept of thermal equilibrium provides the physical foundation for thermometry, which is the science of temperature measurement. The temperature of a system at a thermodynamic equilibrium state is measured through changes in resistance, length, volume, or other physical parameters of the sensing element used in the thermometer, which is brought to thermal equilibrium with the system. Based on the inclusive statement of the second law of thermodynamics given previously, it can be inferred that two systems are in thermal equilibrium with each other if they are separately in thermal equilibrium with a third system. This is sometimes referred to as the zeroth law of thermodynamics.6 The International Temperature Scale of 1990 (ITS-90) was adopted by the International Committee of Weights and Measures in 1989.8 The unit of thermodynamic temperature is kelvin (K), which is defined as 1/273.16 of the thermodynamic temperature of the triple point of water. The Celsius temperature is defined as the difference of the thermodynamic temperature and 273.15 K (the ice point). A difference of temperature may be expressed in either kelvins or degrees Celsius (°C). Although earlier attempts were made to define a temperature scale consistent with the original Celsius temperature scale (i.e., 0°C for the ice point and 100°C for the steam point), a 0.026°C departure arose from more accurate measurements of the steam point, as shown in Table 2.1.9 The steam point is therefore no longer

TABLE 2.1 Two-Phase Points and the Triple Point of Water

                   Temperature (K)    Temperature (°C)
Ice point*         273.15             0
Triple point†      273.16             0.01
Steam point‡       373.124            99.974

* Solid and liquid phases are in equilibrium at a pressure of 1 atm (101.325 kPa).
† Solid, liquid, and vapor phases are in equilibrium.
‡ Liquid and vapor phases are in equilibrium at 1 atm.


used as a defining fixed point in the ITS-90. More accurate steam tables were developed in the 1990s. The ITS-90 defines 17 fixed points, which are determined by primary thermometry with standard uncertainties less than 0.002 K below 303 K and up to 0.05 K at the freezing point of copper ( …

… $[1 - (1 - \varepsilon_1)(1 - \varepsilon_2)] = \sigma_{SB}T_1^4/[1/\varepsilon_1 + 1/\varepsilon_2 - 1]$, and that from surface 2 to surface 1 is $q''_{2\to1} = \sigma_{SB}T_2^4/[1/\varepsilon_1 + 1/\varepsilon_2 - 1]$. Subsequently, the net radiative flux from surface 1 to surface 2 is

$$q''_{12} = q''_{1\to2} - q''_{2\to1} = \frac{\sigma_{SB}(T_1^4 - T_2^4)}{1/\varepsilon_1 + 1/\varepsilon_2 - 1} \tag{2.51}$$

Plugging in $T_1 = 873$ K, $T_2 = 300$ K, and other numerical values, we obtain $q''_{12} = 14{,}433$ W/m².
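Eq. (2.51) is simple to evaluate in code. The sketch below uses the two temperatures quoted above; the emissivities are not visible in this part of the text, so the values 0.5 and 0.8 are assumptions, chosen only because they give $1/\varepsilon_1 + 1/\varepsilon_2 - 1 = 2.25$, which is consistent with the quoted flux. The function name is hypothetical.

```python
SIGMA_SB = 5.67e-8  # Stefan-Boltzmann constant in W/(m^2 K^4), rounded


def net_flux_parallel_plates(t1, t2, eps1, eps2):
    """Net radiative flux from surface 1 to surface 2, Eq. (2.51), in W/m^2."""
    return SIGMA_SB * (t1**4 - t2**4) / (1.0 / eps1 + 1.0 / eps2 - 1.0)


# Temperatures are from the example; the emissivities are ASSUMED values.
EPS1, EPS2 = 0.5, 0.8
q12 = net_flux_parallel_plates(873.0, 300.0, EPS1, EPS2)
print(f"q12 = {q12:.0f} W/m^2")
```

Setting both emissivities to 1 recovers the blackbody limit $\sigma_{SB}(T_1^4 - T_2^4)$, which is a quick sanity check on the formula.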

Gas emission, absorption, and scattering are important for atmospheric radiation and combustion. When radiation travels through a cloud of gas, some of the energy may be absorbed. The absorption of photons raises the energy levels of individual molecules. At sufficiently high temperatures, gas molecules may spontaneously lower their energy levels and emit photons. These changes in energy levels are called radiative transitions, which include bound-bound transitions (between nondissociated molecular states), bound-free transitions (between nondissociated and dissociated states), and free-free transitions (between dissociated states). Bound-free and free-free transitions usually occur at very high temperatures (greater than about 5000 K) and emit in the ultraviolet and visible regions. The most important transitions for radiative heat transfer are bound-bound transitions between vibrational energy levels coupled with rotational transitions. The photon energy (or frequency) must be exactly the same as the difference between two energy levels in order for the photon to be absorbed or emitted; therefore, the quantization of the energy levels results in discrete spectral lines for absorption and emission. The rotational lines superimposed on a vibrational line give a band of closely spaced spectral lines, called the vibration-rotation spectrum. Additional discussion will be given in Chap. 3 about quantized transitions in atoms and molecules. Particles can also scatter electromagnetic waves or photons, causing a change in the direction of propagation. In the early twentieth century, Gustav Mie developed a solution of Maxwell’s equations for scattering of electromagnetic waves by spherical particles, known as the Mie scattering theory, which can be used to predict the scattering phase function.
In the case when the particle sizes are small compared with the wavelength, the formulation reduces to the simple expression obtained earlier by Lord Rayleigh; and the phenomenon is called Rayleigh scattering, in which the scattering efficiency is inversely proportional to the wavelength to the fourth power. The wavelength-dependent characteristic of light scattering by small particles helps explain why the sky is blue and why the sun appears red at sunset. For spheres whose diameters are much greater than the wavelength, geometric optics can be applied by treating the surface as specular or diffuse.
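The inverse fourth-power dependence in Rayleigh scattering can be illustrated with a short calculation. This is a minimal sketch; the wavelengths of 450 nm (blue) and 650 nm (red) are assumed representative values, not taken from the text, and the function name is hypothetical.

```python
def rayleigh_ratio(lambda_short, lambda_long):
    """Ratio of Rayleigh scattering efficiency at lambda_short to that at lambda_long.

    Uses only the proportionality to 1/wavelength^4; both wavelengths must share a unit.
    """
    return (lambda_long / lambda_short) ** 4


BLUE_NM, RED_NM = 450.0, 650.0  # assumed representative wavelengths
ratio = rayleigh_ratio(BLUE_NM, RED_NM)
print(f"Blue light is scattered about {ratio:.1f} times more strongly than red light")
```

With these assumed wavelengths the ratio is a little over 4, which is why scattered skylight is dominated by shorter wavelengths while the directly transmitted beam at sunset is depleted of them.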


The spectral intensity in a participating medium, $I_\lambda = I_\lambda(\xi, \Omega, t)$, depends on the location (the coordinate $\xi$), its direction (the solid angle $\Omega$), and time $t$. In a time interval $dt$, the beam travels from $\xi$ to $\xi + d\xi$ ($d\xi = c\,dt$), and the intensity is attenuated by absorption and out-scattering, but enhanced by emission and in-scattering. The macroscopic description of the radiation intensity is known as the equation of radiative transfer (ERT).17

$$\frac{1}{c}\frac{\partial I_\lambda}{\partial t} + \frac{\partial I_\lambda}{\partial \xi} = a_\lambda I_{b,\lambda}(T) - (a_\lambda + \sigma_\lambda)I_\lambda + \frac{\sigma_\lambda}{4\pi}\int_{4\pi} I_\lambda(\xi, \Omega', t)\,\Phi_\lambda(\Omega', \Omega)\,d\Omega' \tag{2.52}$$

where $a_\lambda$ and $\sigma_\lambda$ are the absorption and scattering coefficients, respectively, $\Omega$ is the solid angle and direction of $I_\lambda$, and $\Omega'$ is the in-scattering solid angle and direction of $I_\lambda(\xi, \Omega', t)$. Here, $\Phi_\lambda(\Omega', \Omega)$ is the scattering phase function ($\Phi_\lambda = 1$ for isotropic scattering), which satisfies the equation $\int_{4\pi}\Phi_\lambda(\Omega', \Omega)\,d\Omega' \equiv 4\pi$. The right-hand side of Eq. (2.52) is composed of three terms: the first accounts for the contribution of emission (which depends on the local gas temperature $T$); the second is the attenuation by absorption and out-scattering; and the third is the contribution of in-scattering from all directions (solid angle $4\pi$) to the direction $\Omega$. Unless ultrafast laser pulses are involved, the transient term is negligible. The ERT for the steady state can be simplified as

$$\frac{\partial I_\lambda(\zeta_\lambda, \Omega)}{\partial \zeta_\lambda} + I_\lambda(\zeta_\lambda, \Omega) = (1 - \eta_\lambda)I_{b,\lambda} + \frac{\eta_\lambda}{4\pi}\int_{4\pi} I_\lambda(\zeta_\lambda, \Omega')\,\Phi_\lambda(\Omega', \Omega)\,d\Omega' \tag{2.53}$$

where $\zeta_\lambda = \int_0^{\xi}(a_\lambda + \sigma_\lambda)\,d\xi$ is the optical path length, and $\eta_\lambda = \sigma_\lambda/(a_\lambda + \sigma_\lambda)$ is called the scattering albedo. This is an integro-differential equation, and its right-hand side is called the source function. The integration of the spectral intensity over all wavelengths and all directions gives the radiative heat flux. Unless the temperature field is prescribed, Eq. (2.53) is coupled with the heat conduction equation in a macroscopically stationary medium and the energy conservation equation in a fluid with convection.
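The two derived quantities in Eq. (2.53), the optical path length and the scattering albedo, can be computed directly when the coefficients are uniform along the path. The following is a minimal sketch under that assumption; the coefficient values and path length are illustrative only, and the function names are hypothetical. The last line additionally assumes a cold medium with in-scattering neglected, in which case the ERT reduces to exponential attenuation of the incident intensity.

```python
import math


def optical_path_length(a, sigma_s, distance):
    """Optical path length for absorption and scattering coefficients uniform along the path."""
    return (a + sigma_s) * distance


def scattering_albedo(a, sigma_s):
    """Ratio of the scattering coefficient to the extinction coefficient (a + sigma_s)."""
    return sigma_s / (a + sigma_s)


# Illustrative values (assumed): coefficients in 1/m, path length in m.
A_COEF, SIGMA_COEF, PATH = 0.5, 1.5, 0.2

zeta = optical_path_length(A_COEF, SIGMA_COEF, PATH)
albedo = scattering_albedo(A_COEF, SIGMA_COEF)
# Cold medium, in-scattering neglected: dI/dzeta = -I, so I/I0 = exp(-zeta).
transmitted_fraction = math.exp(-zeta)

print(f"optical path length = {zeta:.2f}, albedo = {albedo:.2f}, I/I0 = {transmitted_fraction:.3f}")
```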

Analytical solutions of the ERT rarely exist for applications with multidimensional and nonhomogeneous media. Approximate models have been developed to deal with special types of problems, including Hottel’s zonal method, the differential and moment methods (often using the spherical harmonics approximation), and the discrete ordinates method. The statistical model using the Monte Carlo method is often used for complicated geometries and radiative properties.17 Analytical solutions can be obtained only for limited simple cases.

Example 2-7. A gray, isothermal gas at a temperature $T_g = 3000$ K occupies the space between two large, parallel blackbody surfaces. Surface 1 is heated to a temperature $T_1 = 1000$ K, while surface 2 is maintained at a relatively low temperature by water cooling. It is desired to know the amount of heat that must be removed from surface 2. If the scattering is negligible, calculate the heat flux at surface 2 for $a_\lambda L = 0.01$, 0.1, 1, and 10, where $L$ is the distance between the two surfaces.

Solution. For a gray medium without scattering, Eq. (2.53) becomes $(1/a_\lambda)\,dI/d\xi + I(\xi, \Omega) = I_b(T_g)$, where $\theta$ is the angle between $\xi$ and $x$. With $I_b(T_g) = \sigma_{SB}T_g^4/\pi$ and $I(0) = I_b(T_1) = \sigma_{SB}T_1^4/\pi$, the ERT can be integrated from $x = 0$ to $x = L$. The result is

$$I(L, \theta) = \frac{\sigma_{SB}}{\pi}T_1^4\exp\left(-\frac{a_\lambda L}{\cos\theta}\right) + \frac{\sigma_{SB}}{\pi}T_g^4\left[1 - \exp\left(-\frac{a_\lambda L}{\cos\theta}\right)\right]$$

The radiative flux at $x = L$ can be obtained by integrating the intensity over the hemisphere, i.e.,

$$q''(a_\lambda L) = \int_0^{2\pi}\int_0^{\pi/2}\frac{\sigma_{SB}}{\pi}\left[T_g^4 - (T_g^4 - T_1^4)e^{-a_\lambda L/\cos\theta}\right]\cos\theta\,\sin\theta\,d\theta\,d\phi = \sigma_{SB}T_g^4 - 2\sigma_{SB}(T_g^4 - T_1^4)E_3(a_\lambda L)$$

where $E_3(\zeta) = \int_0^1 \mu\,e^{-\zeta/\mu}\,d\mu$ is called the third exponential integral and can be numerically evaluated. The final results are tabulated as follows:

$a_\lambda L$        0.01            0.1             1               10
$E_3(a_\lambda L)$   0.49            0.416           0.11            3.48 × 10⁻⁶
$q''$ (W/m²)         1.474 × 10⁵     8.187 × 10⁵     3.595 × 10⁶     4.593 × 10⁶

Discussion. In the optically thick limit ($a_\lambda L \gg 1$), $q'' \approx \sigma_{SB}T_g^4$, and all radiation leaving surface 1 will be absorbed by the gas before reaching surface 2. On the other hand, the heat flux is much greater than $\sigma_{SB}T_1^4 = 56.7$ kW/m² at $a_\lambda L = 0.01$. The gas absorption can be neglected in the optically thin limit; however, its emission contributes significantly to the radiative flux at surface 2. This is because the gas temperature is much higher than that of surface 1 and $L/\cos\theta$ can be much longer than $L$ for large $\theta$ values.
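The numerical evaluation mentioned in Example 2-7 can be sketched in a few lines. The code below integrates the third exponential integral with a simple midpoint rule (one possible choice, not necessarily the method used for the table) and then evaluates the flux expression for the four optical thicknesses. The rounded value of the Stefan-Boltzmann constant and the function names are assumptions. Small differences from the tabulated values are to be expected, since the table appears to use rounded values of $E_3$.

```python
import math

SIGMA_SB = 5.67e-8  # Stefan-Boltzmann constant in W/(m^2 K^4), rounded


def exp_integral_3(zeta, n=20000):
    """Third exponential integral, the integral of mu*exp(-zeta/mu) over 0 < mu < 1.

    Evaluated with the midpoint rule on n equal subintervals.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        mu = (k + 0.5) * h
        total += mu * math.exp(-zeta / mu)
    return total * h


def flux_at_surface_2(a_l, t_gas=3000.0, t_1=1000.0):
    """Radiative flux at x = L: sigma*Tg^4 - 2*sigma*(Tg^4 - T1^4)*E3(a*L), in W/m^2."""
    return SIGMA_SB * t_gas**4 - 2.0 * SIGMA_SB * (t_gas**4 - t_1**4) * exp_integral_3(a_l)


for a_l in (0.01, 0.1, 1.0, 10.0):
    print(f"aL = {a_l:5.2f}  E3 = {exp_integral_3(a_l):.4g}  q = {flux_at_surface_2(a_l):.4g} W/m^2")
```

The midpoint rule avoids evaluating the integrand at $\mu = 0$, where the exponent diverges; a library routine for exponential integrals could be substituted if one is available.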

2.5 SUMMARY

This chapter provided an overview of classical or equilibrium thermodynamics, derived following logical steps and on a general basis, as well as the functional relations and thermodynamic properties of simple systems and ideal pure substances. Built upon the foundations of thermodynamics, the basic heat transfer modes were elaborated in a coherent way. Entropy generation is inevitably associated with any heat transfer process. The connection between heat transfer and entropy generation, which has been omitted by most heat transfer textbooks, was also discussed. The introduction of thermal radiation not only covered most of the undergraduate-level materials but also presented some basic graduate-level materials. This chapter should serve as a bridge or a reference to the rest of the book, dealing with energy transfer processes in micro/nanosystems and/or from a microscopic viewpoint of macroscopic phenomena.

REFERENCES

1. H. B. Callen, Thermodynamics and an Introduction to Thermostatistics, 2nd ed., Wiley, New York, 1985.
2. G. N. Hatsopoulos and J. H. Keenan, Principles of General Thermodynamics, Wiley, New York, 1965; J. H. Keenan, Thermodynamics, Wiley, New York, 1941.
3. E. P. Gyftopoulos and G. P. Beretta, Thermodynamics: Foundations and Applications, Macmillan, New York, 1991; also see the augmented edition, Dover Publications, New York, 2005.
4. R. E. Sonntag, C. Borgnakke, and G. J. van Wylen, Fundamentals of Thermodynamics, 5th ed., Wiley, New York, 1998.
5. M. J. Moran and H. N. Shapiro, Fundamentals of Engineering Thermodynamics, 4th ed., Wiley, New York, 2000.
6. A. Bejan, Advanced Engineering Thermodynamics, 2nd ed., Wiley, New York, 1997.
7. J. Kestin (ed.), The Second Law of Thermodynamics, Dowden, Hutchinson & Ross, Inc., Stroudsburg, PA, 1976.
8. H. Preston-Thomas, “The International Temperature Scale of 1990 (ITS-90),” Metrologia, 27, 3–10, 1990.
9. Z. M. Zhang, “Surface temperature measurement using optical techniques,” Annu. Rev. Heat Transfer, 11, 351–411, 2000.
10. M. Kaviany, Principles of Heat Transfer, Wiley, New York, 2002.
11. F. P. Incropera and D. P. DeWitt, Fundamentals of Heat and Mass Transfer, 5th ed., Wiley, New York, 2002.
12. M. N. Özışık, Heat Conduction, 2nd ed., Wiley, New York, 1993.
13. Y. S. Touloukian and C. Y. Ho (eds.), Thermophysical Properties of Matter: The TPRC Data Series (13 volumes compilation of data on thermal conductivity, specific heat, linear expansion coefficient, thermal diffusivity, and radiative properties), Plenum Press, New York, 1970–1977.
14. A. Bejan, Entropy Generation Minimization, CRC Press, Boca Raton, FL, 1996.
15. R. F. Barron, Cryogenic Heat Transfer, Taylor & Francis, Philadelphia, PA, 1999.
16. M. C. Potter and D. C. Wiggert, Mechanics of Fluids, Prentice Hall, New Jersey, 1991.
17. R. Siegel and J. R. Howell, Thermal Radiation Heat Transfer, 4th ed., Taylor & Francis, New York, 2002.

PROBLEMS

2.1. Give examples of steady state. Give examples of thermodynamic equilibrium state. Give an example of spontaneous process. Is the growth of a plant a spontaneous process? Give an example of adiabatic process.

2.2. What is work? Describe an experiment that can measure the amount of work. What is heat? Describe an apparatus that can be used to measure heat. Are work and heat properties of a system?

2.3. Expand Eq. (2.1) and Eq. (2.2) in terms of the rate of energy and entropy change of an open system, which is subjected to work output, heat interactions, and multiple inlets and outlets of steady flow.

2.4. Discuss the remarks of Rudolf Clausius in 1867: (a) The energy of the universe is constant. (b) The entropy of the universe strives to attain a maximum value.

2.5. For a cyclic device experiencing heat interactions with reservoirs at $T_1, T_2, \ldots$, the Clausius inequality can be expressed as $\sum_i \delta Q_i/T_i \le 0$ or $\oint \delta Q/T \le 0$, regardless of whether the device produces or consumes work. Note that $\delta Q$ is positive when heat is received by the device. Prove the Clausius inequality by applying the second law to a closed system.

2.6. In the stable-equilibrium states, the energy and the entropy of a solid are related by $E = 3 \times 10^5 \exp[(S - S_0)/1000]$, where $E$ is in J, $S$ is in J/K, and $S_0$ is the entropy of the solid at a reference temperature of 300 K. Plot this relation in an E-S graph. Find expressions for $E$ and $S$ in terms of its temperature $T$ and $S_0$.

2.7. For an isolated system, give the mathematical expressions of the first and second laws of thermodynamics. Give graphic illustrations using an E-S graph.

2.8. Place two identical metal blocks A and B, initially at different temperatures, in contact with each other but without interactions with any other systems. A thermal equilibrium is reached quickly. System C represents the combined system of both A and B. (a) Is the process reversible or not? Which system has experienced a spontaneous change of state? Which systems have experienced an induced change of state? (b) Assume that the specific heat of the metal is independent of temperature, $c_p = 240$ J/(kg·K), the initial temperatures are $T_{A1} = 800$ K and $T_{B1} = 200$ K, and the mass of each block is 5 kg. What is the final temperature? What is the total entropy generation in this process? (c) Show the initial and final states of systems A, B, and C in a u-s diagram, and indicate which state is not an equilibrium state. Determine the adiabatic availability of system C in the initial state.

2.9. Two blocks made of the same material with the same mass are allowed to interact with each other but isolated from the surroundings. Initially, block A is at 800 K and block B at 200 K. Assuming that the specific heat is independent of temperature, show that the final equilibrium temperature is 500 K. Determine the maximum and minimum entropies that may be transferred from block A to block B.

2.10. A cyclic machine receives 325 kJ heat from a 1000 K reservoir and rejects 125 kJ heat to a 400 K reservoir in a cycle that produces 200 kJ work. Is this cycle reversible, irreversible, or impossible?

2.11. If $z = z(x, y)$, then $dz = f\,dx + g\,dy$, where $f(x, y) = \partial z/\partial x$ and $g(x, y) = \partial z/\partial y$. Therefore,

$$\frac{\partial f}{\partial y} = \frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial g}{\partial x}$$

The second-order derivatives of the fundamental equation and each of the characteristic functions yield a Maxwell relation. Maxwell’s relations are very useful for evaluating the properties of a system in the stable-equilibrium states. For a closed system without chemical reactions, we have $dN_i \equiv 0$. Show that

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial P}{\partial S}\right)_V, \quad \left(\frac{\partial T}{\partial P}\right)_S = \left(\frac{\partial V}{\partial S}\right)_P, \quad \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V, \quad \text{and} \quad \left(\frac{\partial S}{\partial P}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_P$$

2.12. The isobaric volume expansion coefficient is defined as $\beta_P = (1/v)(\partial v/\partial T)_P$, the isothermal compressibility is $\kappa_T = -(1/v)(\partial v/\partial P)_T$, and the speed of sound is $v_a = \sqrt{(\partial P/\partial \rho)_s}$. For an ideal gas, show that $\beta_P = 1/T$, $\kappa_T = 1/P$, and $v_a = \sqrt{\gamma RT}$.

2.13. For a system with a single type of constituents, the fundamental relation obtained by experiments gives $S = \alpha(NVU)^{1/3}$, where $\alpha$ is a positive constant, and $N$, $V$, $S$, and $U$ are the number of molecules, the volume, the entropy, and the internal energy of the system, respectively. Obtain expressions of the temperature and the pressure in terms of $N$, $V$, $U$, and $\alpha$. Show that $S = 0$ at zero temperature for constant $N$ and $V$.

2.14. For blackbody radiation in an evacuated enclosure of uniform wall temperature $T$, the energy density can be expressed as $u_v = U/V = (4/c)\sigma_{SB}T^4$, where $U$ is the internal energy, $V$ the volume, $c$ the speed of light, and $\sigma_{SB}$ the Stefan-Boltzmann constant. Determine the entropy $S(T, V)$ and the pressure $P(T, V)$, which is called the radiation pressure. Show that the radiation pressure is a function of temperature only and negligibly small at moderate temperatures. Hint:

$$S = \int_0^T \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V dT \quad \text{and} \quad P = T\left(\frac{\partial S}{\partial V}\right)_T - \left(\frac{\partial U}{\partial V}\right)_T$$

2.15. A cyclic machine can only interact with two reservoirs at temperatures $T_A = 298$ K and $T_B = 77.3$ K, respectively. (a) If heat is extracted from reservoir A at a rate of $\dot{Q} = 1000$ W, what is the maximum rate of work that can be generated ($\dot{W}_{\max}$)? (b) If no work is produced, what is the rate of entropy generation ($\dot{S}_{\text{gen}}$) of the cyclic machine? (c) Plot $\dot{S}_{\text{gen}}$ versus $\dot{W}$ (the power produced).

2.16. An engineer claimed that it requires much more work to remove 0.1 J of heat from a cryogenic chamber at an absolute temperature of 0.1 K than to remove 270 J of heat from a refrigerator at 270 K. Assuming that the environment is at 300 K, justify this claim by calculating the minimum work required for each refrigeration task.

2.17. A solid block [$m = 10$ kg and $c_p = 0.5$ kJ/(kg·K)], initially at room temperature ($T_{A,1} = 300$ K), is cooled with a large tank of liquid-gas mixture of nitrogen at $T_B = 77.3$ K and atmospheric pressure. (a) After the block reaches the liquid-nitrogen temperature, what is the total entropy generation ($S_{\text{gen}}$)? (b) Given the specific enthalpy of evaporation of nitrogen, $h_{fg} = 198.8$ kJ/kg, what must be its specific entropy of evaporation $s_{fg}$ in kJ/(kg·K), in order for the nitrogen tank to be modeled as a reservoir? Does $h_{fg} = T_{\text{sat}}s_{fg}$ always hold?

2.18. Two same-size solid blocks of the same material are isolated from other systems [specific heat $c_p = 2$ kJ/(kg·K); mass $m = 5$ kg]. Initially block A is at a temperature $T_{A1} = 300$ K and block B at $T_{B1} = 1000$ K. (a) If the two blocks are put together, what will be the equilibrium temperature ($T_2$) and how much entropy will be generated ($S_{\text{gen}}$)? (b) If the two blocks are connected with a cyclic machine, what is the maximum work that can be obtained ($W_{\max}$)? What would be the final temperature of the blocks ($T_3$) if the maximum work were obtained?

2.19. A rock [density $\rho = 2800$ kg/m³ and specific heat $c_p = 900$ J/(kg·K)] of 0.8 m³ is heated to 500 K using solar energy. A heat engine (cyclic machine) receives heat from the rock and rejects heat to the ambient at 290 K. The rock therefore cools down. (a) Find the maximum energy (heat) that the rock can give out. (b) Find the maximum work that can be done by the heat engine, $W_{\max}$. (c) In an actual process, the final temperature of the rock is 330 K and the work output from the engine is only half of $W_{\max}$. Determine the entropy generation of the actual process.


2.20. Consider three identical solid blocks with a mass of 5 kg each, initially at 300, 600, and 900 K, respectively. The specific heat of the material is $c_p = 2000$ J/(kg·K). A cyclic machine is available that can interact only with the three blocks. (a) What is the maximum work that can be produced? What are the final temperatures of each block? Is the final state in equilibrium? (b) If no work is produced, i.e., simply putting the three blocks together, what will be the maximum entropy generation? What will be the final temperature? (c) If the three blocks are allowed to interact via cyclic machine but not with any other systems in the environment, what is the highest temperature that can be reached by one of the blocks? (d) If the three blocks are allowed to interact via cyclic machine but not with any other systems in the environment, what is the lowest temperature that can be reached by one of the blocks?

2.21. Electrical power is used to raise the temperature of a 500 kg rock from 25 to 500°C. The specific heat of the rock material is $c_p = 0.85$ kJ/(kg·K). (a) If the rock is heated directly through resistive (Joule) heating, how much electrical energy is needed? Is this process reversible? If not, how much entropy is generated in this process? (b) By using cyclic devices that can interact with both the rock and the environment at 25°C, what is the minimum electrical energy required?

2.22. An insulated cylinder of 2 m³ is divided into two parts of equal volume by an initially locked piston. Side A contains air at 300 K and 200 kPa; side B contains air at 1500 K and 1 MPa. The piston is now unlocked so that it is free to move and it conducts heat. An equilibrium state is reached between the two sides after a while. (a) Find the masses in both A and B. (b) Find the final temperatures, pressures, and volumes for both A and B. (c) Find the entropy generation in this process.

2.23. A piston-cylinder contains 0.56 kg of N₂ gas, initially at 600 K. A cyclic machine receives heat from the cylinder and releases heat to the environment at 300 K. Assume that the specific heat of N₂ is $c_p = 1.06$ kJ/(kg·K) and the pressure inside the cylinder is maintained at 100 kPa by the environment. What is the maximum work that can be produced by the machine? What is the thermal efficiency (defined as the ratio of the work output to the heat received)? The thermodynamic efficiency can be defined as the ratio of the actual work produced to the maximum work. Plot the thermodynamic efficiency as a function of the entropy generation. What is the maximum entropy generation?

2.24. An air stream [$c_p = 1$ kJ/(kg·K) and $M = 29.1$ kg/kmol] flows through a power plant. The stream enters a turbine at $T_1 = 750$ K and $P_1 = 6$ MPa, and exits at $P_2 = 1.2$ MPa into a recovery unit, which can exchange heat with the environment at 25°C and 100 kPa. The stream then exits the recovery unit to the environment. The turbine is thermally insulated and has an efficiency $\eta_t = 0.85$. (a) Find the power per unit mass flow rate produced by the turbine. (b) Calculate the entropy generation rate in the turbine. (c) Determine the largest power that can be produced by the recovery unit.

2.25. Water flows in a perfectly insulated, steady-state, horizontal duct of variable cross-sectional area. Measurements were taken at two ports and the data were recorded in a notebook as follows. For port 1, speed $\xi_1 = 3$ m/s, pressure $P_1 = 50$ kPa, and temperature $T_1 = 40$°C; for port 2, $\xi_2 = 5$ m/s and $P_2 = 45$ kPa. Some information was accidentally left out by the student taking the notes. Can you determine $T_2$ and the direction of the flow based on the available information? Hint: Model the water as an ideal incompressible liquid with $c_p = 4.2$ kJ/(kg·K) and specific volume $v = 10^{-3}$ m³/kg.

2.26. An insulated rigid vessel contains 0.4 kmol of oxygen at 200 kPa separated by a membrane from 0.6 kmol of carbon dioxide at 400 kPa; both sides are initially at 300 K. The membrane is suddenly broken and, after a while, the mixture comes to a uniform state (equilibrium). (a) Find the final temperature and pressure of the mixture. (b) Determine the entropy generation due to irreversibility.

2.27. Pure N₂ and air (21% O₂ and 79% N₂ by volume), both at 298 K and 120 kPa, enter a chamber at a flow rate of 0.1 and 0.3 kmol/s, respectively. The new mixture leaves the chamber at the same temperature and pressure as the incoming streams. (a) What are the mole fractions and the mass fractions of N₂ and O₂ at the exit? (b) Find the enthalpy change in the mixing process. Find the entropy generation rate of the mixing process. (c) Consider a process in which the flow directions are reversed. The chamber now contains necessary devices for the separation, and it may transfer heat to the environment at 298 K. What is the minimum amount of work per unit time needed to operate the separation devices?


2.28. A Carnot engine receives energy from a reservoir at $T_H$ and rejects heat to the environment at $T_0$ via a heat exchanger. The engine works reversibly between $T_H$ and $T_L$, where $T_L$ is the temperature of the higher-temperature side of the heat exchanger. The product of the area and the heat transfer coefficient of the heat exchanger is $\alpha$. Therefore, the heat that must be rejected to the environment through the heat exchanger is $\dot{Q}_L = \alpha(T_L - T_0)$. Given $T_H = 800$ K, $T_0 = 300$ K, and $\alpha = 2300$ W/K, determine the value of $T_L$ so that the heat engine will produce maximum work, and calculate the power production and the entropy generation in such a case.

2.29. To measure the thermal conductivity, a thin-film electric heater is sandwiched between two plates whose sides are well insulated. Each plate has an area of 0.1 m² and a thickness of 0.05 m. The outsides of the plates are exposed to air at $T_\infty = 25$°C with a convection coefficient of $h = 40$ W/(m²·K). The electric power of the heater is 400 W and a thermocouple inserted between the two plates measures a temperature of $T_1 = 175$°C at steady state. Determine the thermal conductivity of the plate material. Find the total entropy generation rate. Comment on the fraction of entropy generation due to conduction and convection.

2.30. An electric current, $I = 2$ A, passes through a resistive wire of diameter $D = 3$ mm with a resistivity $r_e = 1.5 \times 10^{-4}$ Ω·m. The cable is placed in ambient air at 27°C with a convection coefficient $h = 20$ W/(m²·K). Assume a steady state has been reached and neglect radiation. Determine the radial temperature distribution inside the wire. Determine the volumetric entropy generation rate $\dot{s}_{\text{gen}}$ as a function of radius. Determine the total entropy generation rate per unit length of the cable. Hint: For steady-state conduction, $\dot{s}_{\text{gen}} = (1/T)\nabla\cdot\mathbf{q}'' - (1/T^2)(\mathbf{q}''\cdot\nabla T)$.

2.31. Find the thermal conductivity of intrinsic (undoped) silicon, heavily doped silicon, quartz, glass, diamond, graphite, and carbon from 100 to 1000 K from Touloukian and Ho.13 Discuss the variations between different materials, crystalline structures, and doping concentrations.

2.32. Find the thermal conductivity of copper from 1 to 1000 K from Touloukian and Ho.13 Discuss the general trend in terms of temperature dependence, and comment on the effect of impurities.

2.33. For laminar flow over a flat plate, the velocity and thermal boundary layer thicknesses can be calculated by $\delta(x) = 5x\,\mathrm{Re}_x^{-1/2}$ and $\delta_t(x) = 5x\,\mathrm{Re}_x^{-1/2}\,\mathrm{Pr}^{-1/3}$, respectively. Use room temperature data to calculate and plot the boundary layer thicknesses for air, water, engine oil, and mercury for different values of $U_\infty$. Discuss the main features. Hint: Property data can be found from Incropera and DeWitt.11

2.34. Air at 14°C and atmospheric pressure is in parallel flow over a flat plate of 2 × 2 m². The air velocity is 3 m/s and the surface is maintained at 140°C. Determine the average convection coefficient and the rate of heat transfer from the plate to air. (For air at 350 K, which is the average temperature between the surface and fluid, $k = 0.03$ W/(m·K), $\nu = 20.9 \times 10^{-6}$ m²/s, and $\mathrm{Pr} = 0.7$.)

2.35. Plot the blackbody intensity (Planck’s law) as a function of wavelength for several temperatures. Discuss the main features of this function. Show that in the long-wavelength limit, the blackbody function can be approximated by $e_{b,\lambda}(\lambda, T) \approx 2\pi c k_B T/\lambda^4$, which is the Rayleigh-Jeans formula.

2.36. Calculate the net radiative heat flux from the human body at a surface temperature of $T_s = 308$ K, with an emissivity $\varepsilon = 0.9$, to the room walls at 298 K. Assume air at 298 K has a natural convection coefficient of 5 W/(m²·K). Neglecting evaporation, calculate the natural convection heat flux from the person to air. Comment on the significance of thermal radiation.

2.37. A combustion is fired in a spherical enclosure of diameter $D = 50$ cm with a constant wall temperature of 600 K. The temperature of the combustion gas may be approximated as uniform at 2300 K. The absorption coefficient of the gas is $a_\lambda = 0.01$ cm⁻¹, which is independent of wavelength. Assuming that the wall is black and neglecting the scattering effect, determine the net heat transfer rate between the gas and the inner wall of the sphere.


CHAPTER 3

ELEMENTS OF STATISTICAL THERMODYNAMICS AND QUANTUM THEORY

Classical statistical mechanics is based on the assumption that all matter is composed of a myriad of small discrete particles, such as molecules and atoms, in any given macroscopic volume.1–5 There are about $N = 2.5 \times 10^{16}$ molecules per cubic millimeter of air at standard conditions (25°C and 1 atm). These particles are in continuous random motion, which generally obeys the laws of classical mechanics. A complete microscopic description of a system requires the identification of the position $\mathbf{r}_i(t)$ and velocity $\mathbf{v}_i(t)$ of each particle (here, subscript $i$ indicates the $i$th particle) at any time. For a simple system of $N$ molecules in a box of volume $V$, one can write Newton’s law of motion for each molecule as

$$\sum_j \mathbf{F}_{ij}(\mathbf{r}_i, \mathbf{r}_j, t) = m_i\frac{d\mathbf{v}_i}{dt}, \quad i = 1, 2, \ldots, N \tag{3.1}$$

where Fij is the intermolecular force that the jth molecule exerts on the ith molecule, and mi is the mass of the ith molecule. The initial position and velocity, as well as the nature of collisions among particles and that between particles and the walls of the box, must be specified in order to solve the N equations. Although this approach is straightforward, there are two major barriers. First, the intermolecular forces or potentials are often complicated and difficult to determine. Second, the solution of Eq. (3.1) requires significant computer resources even for rather simple problems. Statistical methods are often used instead to obtain microscopic descriptions that are related to macroscopic behaviors. Statistical mechanics aims at finding the equilibrium distribution of certain types of particles in the velocity space. It provides a linkage between macroscopic thermodynamic properties and the microscopic behavior and a means to evaluate some thermodynamic properties. Kinetic theory, on the other hand, deals with nonequilibrium processes. It gives a microscopic description of transport phenomena and helps predict some important transport properties, as will be seen in Chap. 4. Along with the rapid development in computing speed and memory, molecular dynamics (MD) simulation has become a powerful tool for the investigation of phenomena occurring in nanostructures and/or at very short time scales. In the MD method, the location and the velocity of every particle are calculated at each time step by applying Eq. (3.1) with a suitable potential function.6,7 Thermodynamic properties are then evaluated using statistical mechanics formulation. Further discussion about the application of MD simulation to predict the thermal properties of nanostructures will be given in Chap. 7. This chapter starts with a statistical model of independent particles and a brief introduction to the basic principles of quantum mechanics. 
The necessary mathematical background is summarized in Appendix B. It is highly recommended that one review the materials covered in the appendix before studying this chapter. The three important distributions are derived based on the statistics for different types of particles. The microscopic descriptions and results are then linked to macroscopic quantities and the laws of thermodynamics. The application to ideal gases is presented at the end of this chapter, while discussions of blackbody radiation, lattice vibrations, and free electron gas will be deferred to later chapters.
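To make the MD idea mentioned in this introduction concrete, the following is a minimal sketch, not the method of any particular MD code. It integrates Newton's law, Eq. (3.1), for just two particles moving in one dimension. The Lennard-Jones potential, the reduced units (well depth, length scale, and mass all set to 1), the velocity Verlet time-stepping scheme, and every function name are assumptions chosen for illustration.

```python
# Minimal 1-D molecular dynamics sketch in reduced Lennard-Jones units.
# Assumptions (not from the text): LJ potential, unit mass, velocity Verlet.

def lj_potential(r):
    """Lennard-Jones pair potential U(r) = 4 (r^-12 - r^-6)."""
    inv6 = r ** -6
    return 4.0 * (inv6 * inv6 - inv6)


def lj_force(r):
    """Pair force F = -dU/dr along the separation (positive = repulsive)."""
    inv6 = r ** -6
    return 24.0 * (2.0 * inv6 * inv6 - inv6) / r


def pair_forces(x1, x2):
    """Forces on particle 1 and particle 2 (x2 > x1); equal and opposite."""
    f = lj_force(x2 - x1)
    return -f, f


def total_energy(x1, x2, v1, v2, mass=1.0):
    """Kinetic plus potential energy of the two-particle system."""
    return 0.5 * mass * (v1 * v1 + v2 * v2) + lj_potential(x2 - x1)


def run(x1=0.0, x2=1.5, v1=0.0, v2=0.0, mass=1.0, dt=1e-3, steps=5000):
    """Advance positions and velocities with the velocity Verlet scheme."""
    f1, f2 = pair_forces(x1, x2)
    energies = [total_energy(x1, x2, v1, v2, mass)]
    for _ in range(steps):
        # Half-step velocity update with the current forces.
        v1 += 0.5 * dt * f1 / mass
        v2 += 0.5 * dt * f2 / mass
        # Full-step position update.
        x1 += dt * v1
        x2 += dt * v2
        # Recompute forces at the new positions, then finish the velocity update.
        f1, f2 = pair_forces(x1, x2)
        v1 += 0.5 * dt * f1 / mass
        v2 += 0.5 * dt * f2 / mass
        energies.append(total_energy(x1, x2, v1, v2, mass))
    return (x1, x2, v1, v2), energies


state, energies = run()
drift = max(abs(e - energies[0]) for e in energies)
print("final separation:", state[1] - state[0])
print("largest energy deviation:", drift)
```

With these settings the pair oscillates in its potential well, and the total energy and total momentum should stay nearly constant, which is a common sanity check for a time-stepping scheme. A realistic simulation would use many particles in three dimensions with boundary conditions and a potential fitted to the material.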

3.1 STATISTICAL MECHANICS OF INDEPENDENT PARTICLES

We say particles are independent when their energies are independent of each other and the total energy is the sum of the energies of individual particles. Consider a system that has $N$ independent particles of the same type confined in a volume $V$. The total internal energy of the system is $U$, which is the sum of the energies of all particles. Particles may have different energies and can be grouped according to their energies. It is of interest to know how many particles there are within certain energy intervals. We can subdivide energy into a large number of discretized energy levels. As illustrated in Fig. 3.1, there are $N_i$ particles on the $i$th energy level, each with energy exactly equal to $\varepsilon_i$.

FIGURE 3.1 Illustration of (a) a simple system of independent particles and (b) energy levels.

From the classical mechanics point of view, it appears that the increment between adjacent energy levels can be indefinitely small. The particles are distinguishable, and there is no limit on the number of particles on each energy level. Quantum mechanics predicts that the energy levels are indeed discretized with finite increments between adjacent energy levels, and the particles are unidentifiable (indistinguishable). An introduction to the basic principles of quantum mechanics is given in Sec. 3.1.3 and a more detailed introduction of the quantum theory is given near the end of this chapter. The conservation equations for the system shown in Fig. 3.1 are

$$\sum_{i=0}^{\infty} N_i = N \tag{3.2}$$

and

$$\sum_{i=0}^{\infty} \varepsilon_i N_i = U \tag{3.3}$$

3.1.1 Macrostates versus Microstates

The thermodynamic state may be viewed in terms of the gross behavior that ignores any differences at the molecular or atomic level, or in terms of the individual particles. A macrostate is determined by the values of $N_0, N_1, N_2, \ldots$ for a given volume (which somehow confines the quantized energy levels), though two different macrostates can have the same energy. Each macrostate may be made up of a number of microscopic arrangements; each microscopic arrangement is called a microstate. In statistical mechanics, all microstates are assumed equally probable. There may be a large number of microstates that correspond to the same macrostate. The number of microstates for each macrostate is termed the thermodynamic probability $\Omega$ of that macrostate. Unlike the stochastic probability that lies between 0 and 1, the thermodynamic probability is usually a very large number. One of the principles underlying statistical mechanics is that the stable-equilibrium state corresponds to the most probable macrostate. Therefore, for given values of $U$, $N$, and $V$, the thermodynamic probability is the largest in the stable-equilibrium state. We will use the following example to illustrate the concepts of microstate and macrostate.

Example 3-1. There are four distinguishable particles in a confined space, and there are two energy levels. How many macrostates are there? How many microstates are there for the macrostate with two particles on each energy level?

Solution. There are five macrostates in total with $(N_1, N_2) = (0, 4), (1, 3), (2, 2), (3, 1)$, and $(4, 0)$, respectively. Because the particles are distinguishable, the microstates will be different only if the particles from different energy levels are interchanged. Using the combination theory, we can figure out that $\Omega(N_1, N_2) = N!/(N_1!\,N_2!) = 4!/(2!\,2!) = 6$, i.e., there are six microstates for the macrostate with two particles on each energy level. It can be shown that this is also the most probable macrostate.
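The counting in Example 3-1 can be checked by brute force. The sketch below lists every macrostate for distinguishable particles on two energy levels and the number of microstates of each, using the binomial coefficient $N!/(N_1!\,N_2!)$. The function name is illustrative and not from the text.

```python
from math import comb


def two_level_macrostates(n_particles):
    """Return (N1, N2, Omega) for every way to split the particles over two levels."""
    return [
        (n1, n_particles - n1, comb(n_particles, n1))
        for n1 in range(n_particles + 1)
    ]


states = two_level_macrostates(4)
for n1, n2, omega in states:
    print(f"(N1, N2) = ({n1}, {n2}): {omega} microstates")

# The most probable macrostate is the one with the largest number of microstates.
most_probable = max(states, key=lambda s: s[2])
total_microstates = sum(omega for _, _, omega in states)
```

The total over all macrostates is $2^4 = 16$, since each of the four distinguishable particles can sit on either level.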

3.1.2 Phase Space

The phase space is a six-dimensional space formed by three coordinates for the position $\mathbf{r}$ and three coordinates for the momentum $\mathbf{p} = m\mathbf{v}$ or velocity $\mathbf{v}$. Each point in the phase space defines the exact location and momentum of an individual particle. If both the space and the momentum are described with the Cartesian system, then a volume element in the phase space is $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$. Figure 3.2 shows a phase space projected to the $x$-$p_x$ plane.

FIGURE 3.2 Phase space projected to the $x$-$p_x$ plane, where $\Delta x\,\Delta p_x$ is an area element.

The three coordinates $(p_x, p_y, p_z)$ form a momentum space. One may choose to use $(v_x, v_y, v_z)$ to form a velocity space. If the momentum space is described in spherical coordinates, the volume element is $dp_x\,dp_y\,dp_z = p^2 \sin\theta\,dp\,d\theta\,d\phi$. The volume contained in a spherical shell from $p$ to $p + dp$ is $4\pi p^2\,dp$. Figure 3.3 illustrates the momentum space projected to the $p_x$-$p_y$ plane, with a spherical shell.

FIGURE 3.3 The $p_x$-$p_y$ plane of the momentum space, showing a spherical shell.

3.1.3 Quantum Mechanics Considerations

The origin of quantum theory can be traced back to about 100 years ago when Planck first used a discrete set of energies to describe the electromagnetic radiation, and thus obtained Planck's distribution (details to be presented in Sec. 8.1). For any given frequency of radiation $\nu$, the smallest energy increment is given by $h\nu$, where $h = 6.626 \times 10^{-34}$ J·s is called Planck's constant. Radiation can be alternatively viewed as electromagnetic waves or traveling energy quanta. The corpuscular theory treats radiation as a collection of energy quanta, called photons. The energy of a photon is given by

$$\varepsilon = h\nu \tag{3.4}$$

From the wave theory, the speed of light $c$ is related to the wavelength $\lambda$ and the frequency by

$$c = \lambda\nu \tag{3.5}$$

In a medium with a refractive index of $n$, $c = c_0/n$ and $\lambda = \lambda_0/n$, where subscript 0 is used to indicate quantities in vacuum with $n = 1$. The speed of light in vacuum is $c_0 = 299{,}792{,}458$ m/s, which is a defined quantity as given in Appendix A. Note that frequency does not change from one medium to another. Based on the relativistic theory, the rest energy $E_0$ of a particle with mass $m$ is

$$E_0 = mc^2 \tag{3.6}$$

The momentum of the particle traveling with speed $v$ is $p = mv$. Since the energy of a photon is $h\nu$ and its speed is $c$, the momentum of a (massless) photon is (see Sec. 3.7)

$$p = \frac{h\nu}{c} = \frac{h}{\lambda} \tag{3.7}$$

Another hypothesis of quantum theory is that the motion of matter may be wavelike, with characteristic wavelength and frequency. Therefore, for a particle moving with velocity $v \ll c$,

$$\lambda_{\mathrm{DB}} = \frac{h}{p} = \frac{h}{mv} \quad \text{and} \quad \nu_{\mathrm{DB}} = \frac{mc^2}{h} \tag{3.8}$$

which are called the de Broglie wavelength and de Broglie frequency, respectively. In 1923, Louis de Broglie postulated that matter may also possess wave characteristics and thereafter resolved the controversy over the nature of radiation. Note that the phase speed of the wave defined by Eq. (3.8) is $c^2/v$, which is greater than the speed of light. The discovery of electron diffraction confirmed de Broglie's hypothesis. For this prediction, de Broglie received the Nobel Prize in Physics in 1929. Eight years later, the 1937 Nobel Prize in Physics was shared by Clinton J. Davisson and George P. Thomson for their independent experiments that demonstrated diffraction of electrons by crystals.

Example 3-2. Calculate the frequency in Hz and photon energy in eV of an ultraviolet (UV) laser beam at a wavelength of $\lambda = 248$ nm and a microwave at $\lambda = 10$ cm. Calculate the de Broglie wavelength of an He atom at 200°C, using the root-mean-square speed of 1717 m/s, and an electron traveling with a speed of $10^6$ m/s.

Solution. The equations are $\nu = c/\lambda$ and $\varepsilon = h\nu$. Assume the refractive index is 1. For the UV beam at 248 nm, $\nu = 1.2 \times 10^{15}$ Hz and $\varepsilon = 8.01 \times 10^{-19}$ J $= 5$ eV. For $\lambda = 10$ cm, $\nu = 3 \times 10^{9}$ Hz $= 3$ GHz and $\varepsilon = 2 \times 10^{-24}$ J $= 1.24 \times 10^{-5}$ eV $= 12.4$ μeV. The mass of an He atom is $m = M/N_A = 6.64 \times 10^{-27}$ kg. Hence, $\lambda_{\mathrm{DB}} = h/mv = 5.8 \times 10^{-11}$ m $= 58$ pm. From Appendix A, $m_e = 9.11 \times 10^{-31}$ kg; therefore, $\lambda_{\mathrm{DB}} = 7.3 \times 10^{-10}$ m $= 0.73$ nm, which is in the x-ray region.
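The arithmetic in Example 3-2 can be reproduced with a few lines of code. The sketch below applies Eqs. (3.4), (3.5), and (3.8) with a refractive index of 1. The constants are standard SI values, and the helium molar mass and all function names are assumptions added for illustration.

```python
H = 6.62607015e-34        # Planck's constant, J s
C0 = 299_792_458.0        # speed of light in vacuum, m/s
EV = 1.602176634e-19      # joules per electron volt
N_A = 6.02214076e23       # Avogadro's constant, 1/mol
M_ELECTRON = 9.1093837e-31  # electron mass, kg
M_HELIUM = 4.002602e-3 / N_A  # mass of one He atom, kg (molar mass assumed)


def photon_frequency_and_energy(wavelength_m):
    """Return (frequency in Hz, photon energy in eV) for a vacuum wavelength."""
    nu = C0 / wavelength_m           # Eq. (3.5)
    energy_ev = H * nu / EV          # Eq. (3.4), converted from J to eV
    return nu, energy_ev


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Return the de Broglie wavelength h/(m v) in meters, valid for v << c."""
    return H / (mass_kg * speed_m_per_s)  # Eq. (3.8)


nu_uv, ev_uv = photon_frequency_and_energy(248e-9)
nu_mw, ev_mw = photon_frequency_and_energy(0.10)
lam_he = de_broglie_wavelength(M_HELIUM, 1717.0)
lam_e = de_broglie_wavelength(M_ELECTRON, 1.0e6)

print(f"UV:        {nu_uv:.3e} Hz, {ev_uv:.2f} eV")
print(f"Microwave: {nu_mw:.3e} Hz, {ev_mw * 1e6:.1f} micro-eV")
print(f"He atom:   {lam_he * 1e12:.0f} pm")
print(f"Electron:  {lam_e * 1e9:.2f} nm")
```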

The foundation of quantum mechanics is the Schrödinger equation, which is a partial differential equation for the time- and space-dependent complex wavefunction, whose squared magnitude gives the probability density. More details can be found in Tien and Lienhard,1 Carey,5 and Griffiths.8 The solutions of the Schrödinger equation support the dual nature of wave and matter, and result in discrete quantized energy levels. Furthermore, there is usually more than one distinguishable quantum state at each energy level, i.e., the energy levels may be degenerate. The number of quantum states for a given energy level is called the degeneracy, denoted by $g_i$ for the $i$th energy level, as shown in Fig. 3.4.

FIGURE 3.4 The degeneracy of the ith energy level.

The uncertainty principle states that the position and momentum of a given particle cannot be measured simultaneously with arbitrary precision. The limit is given by

$$\Delta x\,\Delta p_x \ge \frac{h}{4\pi} \tag{3.9}$$

This result implies that we cannot locate the exact position of a particle in the phase space; all we can say is that the particle is somewhere in a domain whose volume is around $h^3$. The uncertainty principle is one of the cornerstones of quantum mechanics and was formulated in 1927 by Werner Heisenberg, a Nobel Laureate in Physics. In quantum theory, independent particles of the same type are indistinguishable. For certain particles, such as electrons, each quantum state cannot be occupied by more than one particle. This is the Pauli exclusion principle, discovered by Nobel Laureate Wolfgang Pauli in 1925. The result, as we will see, is the Fermi-Dirac statistics that can be used to describe the behavior of free electrons. The collection of free electrons in metals is sometimes called the free electron gas, which exhibits very different characteristics from ideal molecular gases.

3.1.4 Equilibrium Distributions for Different Statistics

The characteristics of various types of particles can be described by different statistics. In this section, we will first introduce three statistics and then apply them to obtain the distribution functions, i.e., the number of particles on each energy level. The application of the distribution functions to the study of thermodynamic properties of ideal molecular gases will be discussed later in this chapter. The applications of statistical thermodynamics to blackbody radiation, lattice vibration, free electrons in metals, and electrons and holes in semiconductors will be discussed in subsequent chapters.

• The Maxwell-Boltzmann (MB) statistics: Particles are distinguishable and there is no limit for the number of particles on each energy level. From Eq. (B.22) in Appendix B, the thermodynamic probability for the distribution shown in Fig. 3.1b is

$$\Omega = \frac{N!}{N_0!\,N_1!\,N_2! \cdots} = \frac{N!}{\prod_{i=0}^{\infty} N_i!}$$

If degeneracy is included as shown in Fig. 3.4, then

$$\Omega_{\mathrm{MB}} = N! \prod_{i=0}^{\infty} \frac{g_i^{N_i}}{N_i!} \tag{3.10}$$

• The Bose-Einstein (BE) statistics: Particles are indistinguishable and there is no limit for the number of particles in each quantum state; there are $g_i$ quantum states on the $i$th energy level. From Eq. (B.23), the number of ways of placing $N_i$ indistinguishable objects into $g_i$ distinguishable boxes is $(g_i + N_i - 1)!/[(g_i - 1)!\,N_i!]$. Therefore, the thermodynamic probability for BE statistics is

$$\Omega_{\mathrm{BE}} = \prod_{i=0}^{\infty} \frac{(g_i + N_i - 1)!}{(g_i - 1)!\,N_i!} \tag{3.11}$$

• The Fermi-Dirac (FD) statistics: Particles are indistinguishable and the energy levels are degenerate. There are $g_i$ quantum states on the $i$th energy level, and each quantum state can be occupied by no more than one particle. Using Eq. (B.21), we obtain the thermodynamic probability for FD statistics as

$$\Omega_{\mathrm{FD}} = \prod_{i=0}^{\infty} \frac{g_i!}{(g_i - N_i)!\,N_i!} \tag{3.12}$$

The three statistics are very important for understanding the molecular, electronic, crystalline, and radiative behaviors that are essential for energy transport processes in both small and large scales. MB statistics can be considered as the limiting case of BE or FD statistics. The thermodynamic relations and the velocity distribution of ideal molecular gases can be understood from MB statistics. BE statistics is important for the study of photons, phonons in solids, and atoms at low temperatures. It is the basis of Planck's law of blackbody radiation, the Debye theory for the specific heat of solids, and the Bose-Einstein condensation, which is important for superconductivity, superfluidity, and laser cooling of atoms. FD statistics can be used to model the electron gas and the electron contribution to the specific heat of solids. It is important for understanding the electronic and thermal properties of metals and semiconductors.

Example 3-3. Four indistinguishable particles are to be placed in two energy levels, each with a degeneracy of 3. Evaluate the thermodynamic probability of all arrangements, considering BE and FD statistics separately. What are the most probable arrangements?

Solution. There are two energy levels, $g_0 = g_1 = 3$, and the total number of particles $N = 4$. The thermodynamic probability is $\Omega = \Omega_0 \times \Omega_1$, which depends on $N_0$ and $N_1$ ($N_0 + N_1 = 4$). Figure 3.5 shows specific cases of the BE and FD distributions.

FIGURE 3.5 Illustration of the arrangement for four particles on two energy levels, each with a degeneracy of 3. (a) Bose-Einstein statistics. (b) Fermi-Dirac statistics.

For BE statistics, we have

$$\Omega_{\mathrm{BE}} = \frac{(N_0 + g_0 - 1)!}{(g_0 - 1)!\,N_0!} \times \frac{(N_1 + g_1 - 1)!}{(g_1 - 1)!\,N_1!} = \frac{(N_0 + 2)(N_0 + 1)}{2} \times \frac{(6 - N_0)(5 - N_0)}{2}$$

For FD statistics, we must have $N_i \le g_i$; therefore, $1 \le N_0 \le 3$, and

$$\Omega_{\mathrm{FD}} = \frac{g_0!}{(g_0 - N_0)!\,N_0!} \times \frac{g_1!}{(g_1 - N_1)!\,N_1!} = \frac{6}{(3 - N_0)!\,N_0!} \times \frac{6}{(N_0 - 1)!\,(4 - N_0)!}$$

The results are summarized in the following table. Clearly, the most probable arrangement for both statistics in this case is $N_0 = N_1 = 2$.

| $N_0$ | $N_1$ | $\Omega_{\mathrm{BE}}$ | $\Omega_{\mathrm{FD}}$ |
|---|---|---|---|
| 0 | 4 | 15 | – |
| 1 | 3 | 30 | 3 |
| 2 | 2 | 36 | 9 |
| 3 | 1 | 30 | 3 |
| 4 | 0 | 15 | – |
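The table in Example 3-3 can be regenerated directly from Eqs. (3.11) and (3.12). In the sketch below, each per-level factor is written as a binomial coefficient, which is equivalent to the factorial forms in those equations. The function names are illustrative, and arrangements forbidden by FD statistics appear as 0 rather than a dash.

```python
from math import comb


def omega_be_level(n, g):
    """Ways to place n indistinguishable particles in g states, no occupancy limit."""
    return comb(g + n - 1, n)  # (g + n - 1)! / ((g - 1)! n!)


def omega_fd_level(n, g):
    """Ways to place n indistinguishable particles in g states, at most one each."""
    return comb(g, n)  # g! / ((g - n)! n!); evaluates to 0 when n > g


G0 = G1 = 3   # degeneracy of each level
N_TOTAL = 4   # total number of particles

rows = []
for n0 in range(N_TOTAL + 1):
    n1 = N_TOTAL - n0
    be = omega_be_level(n0, G0) * omega_be_level(n1, G1)
    fd = omega_fd_level(n0, G0) * omega_fd_level(n1, G1)
    rows.append((n0, n1, be, fd))
    print(f"N0={n0} N1={n1}  BE={be:2d}  FD={fd}")

most_probable_be = max(rows, key=lambda r: r[2])[:2]
most_probable_fd = max(rows, key=lambda r: r[3])[:2]
```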

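For a given simple thermodynamic system of volume $V$, internal energy $U$, and total number of particles $N$, we wish to find the state (identified by the distribution $N_0, N_1, N_2, \ldots$) that maximizes $\Omega$ or $\ln\Omega$, under the constraints given by Eq. (3.2) and Eq. (3.3), based on the method of Lagrange multipliers (Appendix B). For MB statistics with degeneracy, from Eq. (3.10),

$$\ln\Omega = \ln N! + \sum_{i=0}^{\infty} N_i \ln g_i - \sum_{i=0}^{\infty} \ln N_i!$$

For a large number of particles, the Stirling formula gives $\ln N! \approx N \ln N - N$ from Eq. (B.11). The above equation can be approximated as

$$\ln\Omega \approx N \ln N - N + \sum_{i=0}^{\infty} N_i \ln g_i - \sum_{i=0}^{\infty} (N_i \ln N_i - N_i) = N \ln N - N + \sum_{i=0}^{\infty} N_i \left( \ln\frac{g_i}{N_i} + 1 \right)$$

Notice that $N$ and the $g_i$'s are fixed and only the $N_i$'s are variables; therefore,

$$d(\ln\Omega) = \sum_{i=0}^{\infty} \frac{\partial(\ln\Omega)}{\partial N_i}\,dN_i \approx \sum_{i=0}^{\infty} \left( \ln\frac{g_i}{N_i} + 1 - N_i \frac{1}{N_i} \right) dN_i = \sum_{i=0}^{\infty} \ln\frac{g_i}{N_i}\,dN_i = 0 \tag{3.13}$$

From the constraint equations, Eq. (3.2) and Eq. (3.3), we have

$$-\alpha \sum_{i=0}^{\infty} dN_i = 0 \tag{3.14a}$$

and

$$-\beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i = 0 \tag{3.14b}$$

where $\alpha$ and $\beta$ are Lagrangian multipliers and the $\varepsilon_i$'s are treated as constants. Conventionally, negative signs are chosen because $\alpha$ and $\beta$ are generally nonnegative for molecular gases. By adding Eq. (3.14a) and Eq. (3.14b) to Eq. (3.13), we obtain

$$\sum_{i=0}^{\infty} \left( \ln\frac{g_i}{N_i} - \alpha - \beta\varepsilon_i \right) dN_i = 0$$

Because $dN_i$ can be arbitrary, the above equation requires that $\ln(g_i/N_i) - \alpha - \beta\varepsilon_i = 0$. Hence,

$$N_i = \frac{g_i}{e^{\alpha} e^{\beta\varepsilon_i}} \tag{3.15a}$$

or

$$\frac{N_i}{N} = \frac{g_i e^{-\alpha} e^{-\beta\varepsilon_i}}{\sum_{i=0}^{\infty} g_i e^{-\alpha} e^{-\beta\varepsilon_i}} \tag{3.15b}$$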

This is the MB distribution. The physical meanings of $\alpha$ and $\beta$ will be discussed later. Using the same procedure described above, we can obtain the following for BE statistics,

$$N_i = \frac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} - 1} \tag{3.16}$$

which is the BE distribution. For FD statistics, we can obtain the FD distribution as follows

$$N_i = \frac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} + 1} \tag{3.17}$$

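The results for all three statistics are summarized in Table 3.1.

Example 3-4. Derive the BE distribution step by step. Under which condition can it be approximated by the MB distribution?

Solution. Using the thermodynamic probability of BE statistics in Eq. (3.11), we have

$$\ln\Omega = \sum_{i=0}^{\infty} \left[ \ln(g_i + N_i - 1)! - \ln(g_i - 1)! - \ln N_i! \right]$$

$$\approx \sum_{i=0}^{\infty} \left[ (g_i + N_i - 1)\ln(g_i + N_i - 1) - (g_i + N_i - 1) - (g_i - 1)\ln(g_i - 1) + (g_i - 1) - N_i \ln N_i + N_i \right]$$

$$= \sum_{i=0}^{\infty} \left[ (g_i + N_i - 1)\ln(g_i + N_i - 1) - (g_i - 1)\ln(g_i - 1) - N_i \ln N_i \right]$$

Hence,

$$\frac{\partial \ln\Omega}{\partial N_i} \approx \ln(g_i + N_i - 1) + \frac{g_i + N_i - 1}{g_i + N_i - 1} - \ln N_i - \frac{N_i}{N_i} = \ln\left( \frac{g_i + N_i - 1}{N_i} \right) \approx \ln\left( \frac{g_i}{N_i} + 1 \right), \quad \text{since } N_i \gg 1$$

To maximize $\Omega$, we set $d(\ln\Omega) = 0$, i.e.,

$$d(\ln\Omega) = \sum_{i=0}^{\infty} \frac{\partial(\ln\Omega)}{\partial N_i}\,dN_i \approx \sum_{i=0}^{\infty} \ln\left( \frac{g_i}{N_i} + 1 \right) dN_i = 0$$

By adding Lagrangian multipliers, Eq. (3.14a) and Eq. (3.14b), we have $\sum_{i=0}^{\infty} [\ln(g_i/N_i + 1) - \alpha - \beta\varepsilon_i]\,dN_i = 0$. Hence, $N_i = g_i/(e^{\alpha} e^{\beta\varepsilon_i} - 1)$, which is the BE distribution given in Eq. (3.16) and Table 3.1. If $\exp(\alpha + \beta\varepsilon_i) \gg 1$, Eq. (3.16) and Eq. (3.17) reduce to the MB distribution, Eq. (3.15a). Under the limiting case of $g_i \gg N_i \gg 1$, we have

$$\frac{(g_i + N_i - 1)!}{(g_i - 1)!\,N_i!} = \frac{\overbrace{(g_i + N_i - 1) \cdots (g_i + 1)\,g_i}^{N_i\ \text{terms}}}{N_i!} \approx \frac{g_i^{N_i}}{N_i!}$$

and

$$\frac{g_i!}{(g_i - N_i)!\,N_i!} = \frac{\overbrace{g_i (g_i - 1) \cdots (g_i - N_i + 1)}^{N_i\ \text{terms}}}{N_i!} \approx \frac{g_i^{N_i}}{N_i!}$$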

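That is to say that the thermodynamic probability for both the BE and FD statistics reduces to the MB statistics divided by $N!$, which is caused by the assumption of indistinguishable particles. Therefore,

$$\Omega_{\mathrm{MB,corrected}} = \prod_{i=0}^{\infty} \frac{g_i^{N_i}}{N_i!} = \frac{\Omega_{\mathrm{MB}}}{N!} \tag{3.18}$$

is called the "corrected" MB statistics. For ideal molecular gases at reasonably high temperatures, $g_i \gg N_i$. For this reason, the MB distribution may be considered as the limiting case of the BE or FD distribution (see Table 3.1).

TABLE 3.1 Summary of the Three Statistics

| Statistics | Maxwell-Boltzmann (MB) | Bose-Einstein (BE) | Fermi-Dirac (FD) |
|---|---|---|---|
| Name of particles | Boltzons | Bosons | Fermions |
| Examples | Ideal gas molecules & in the limit of bosons and fermions | Photons & phonons | Electrons & protons |
| Distinguishability | Distinguishable | Indistinguishable | Indistinguishable |
| Degeneracy | Degenerate | Degenerate | Degenerate |
| Particles per quantum state | Unlimited | Unlimited | One |
| Thermodynamic probability $\Omega$ | $N! \prod_{i=0}^{\infty} \dfrac{g_i^{N_i}}{N_i!}$ | $\prod_{i=0}^{\infty} \dfrac{(g_i + N_i - 1)!}{(g_i - 1)!\,N_i!}$ | $\prod_{i=0}^{\infty} \dfrac{g_i!}{(g_i - N_i)!\,N_i!}$ |
| In the limit of $g_i \gg N_i$ | $\Omega_{\mathrm{MB}}$ (given above) | $\Omega_{\mathrm{MB}}/N!$ | $\Omega_{\mathrm{MB}}/N!$ |
| $\ln\Omega$ | $N \ln N - N + \sum_{i=0}^{\infty} N_i [\ln(g_i/N_i) + 1]$ | $\sum_{i=0}^{\infty} [(g_i + N_i - 1)\ln(g_i + N_i - 1) - N_i \ln N_i - (g_i - 1)\ln(g_i - 1)]$ | $\sum_{i=0}^{\infty} [g_i \ln g_i - N_i \ln N_i - (g_i - N_i)\ln(g_i - N_i)]$ |
| $d(\ln\Omega)$ | $\sum_{i=0}^{\infty} \ln\left(\dfrac{g_i}{N_i}\right) dN_i$ | $\sum_{i=0}^{\infty} \ln\left(\dfrac{g_i}{N_i} + 1\right) dN_i$ | $\sum_{i=0}^{\infty} \ln\left(\dfrac{g_i}{N_i} - 1\right) dN_i$ |
| $d(\ln\Omega) - \alpha \sum_{i=0}^{\infty} dN_i - \beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i = 0$ | $\ln\left(\dfrac{g_i}{N_i}\right) - \alpha - \beta\varepsilon_i = 0$ | $\ln\left(\dfrac{g_i}{N_i} + 1\right) - \alpha - \beta\varepsilon_i = 0$ | $\ln\left(\dfrac{g_i}{N_i} - 1\right) - \alpha - \beta\varepsilon_i = 0$ |
| Distribution function $N_i$ | $\dfrac{g_i}{e^{\alpha} e^{\beta\varepsilon_i}}$ | $\dfrac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} - 1}$ | $\dfrac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} + 1}$ |
| Applications | Ideal gases; Maxwell's velocity distribution; limiting cases of BE and FD statistics | Planck's law; Bose-Einstein condensation; specific heat of solids | Electron gas; Fermi level; electron specific heat in metals |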
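The statement that the MB distribution is the limiting case of the BE and FD distributions can be seen numerically. The sketch below evaluates the mean occupation per quantum state, $N_i/g_i$, from Eqs. (3.15a), (3.16), and (3.17) as a function of the single combined variable $x = \alpha + \beta\varepsilon_i$. The function name, the use of $x$, and the sample values are illustrative choices, not from the text.

```python
import math


def occupation_per_state(x, statistics):
    """Return N_i/g_i for x = alpha + beta*eps_i under the named statistics."""
    if statistics == "MB":
        return math.exp(-x)                # Eq. (3.15a)
    if statistics == "BE":
        if x <= 0:
            raise ValueError("BE occupation requires alpha + beta*eps > 0")
        return 1.0 / (math.exp(x) - 1.0)   # Eq. (3.16)
    if statistics == "FD":
        return 1.0 / (math.exp(x) + 1.0)   # Eq. (3.17)
    raise ValueError(f"unknown statistics: {statistics}")


for x in (0.5, 2.0, 5.0, 10.0):
    mb = occupation_per_state(x, "MB")
    be = occupation_per_state(x, "BE")
    fd = occupation_per_state(x, "FD")
    print(f"x={x:5.1f}  FD={fd:.6e}  MB={mb:.6e}  BE={be:.6e}")

# Relative deviation of BE and FD from MB when exp(x) >> 1.
x_large = 10.0
mb_large = occupation_per_state(x_large, "MB")
be_dev = occupation_per_state(x_large, "BE") / mb_large - 1.0
fd_dev = 1.0 - occupation_per_state(x_large, "FD") / mb_large
```

For any $x > 0$ the three values are ordered FD < MB < BE, and the gap closes as $\exp(x)$ grows, which corresponds to the sparse-occupation limit $g_i \gg N_i$.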

3.2 THERMODYNAMIC RELATIONS

The thermodynamic properties and relations can be understood from the microscopic point of view. These include the concepts of heat and work, entropy, and the third law of thermodynamics. The partition function is key to the evaluation of thermodynamic properties.

3.2.1 Heat and Work

From Eq. (3.3), we have

$$dU = \sum_{i=0}^{\infty} \varepsilon_i\,dN_i + \sum_{i=0}^{\infty} N_i\,d\varepsilon_i \tag{3.19a}$$

The first term on the right is due to a redistribution of particles among the energy levels (which is related to a change in entropy), while the second is due to a shift in the energy levels associated with, e.g., a volume change. Consider a reversible quasi-equilibrium process for a closed system (such as a piston/cylinder arrangement). The work is associated with the volume change that does not change the entropy of the system, while heat transfer changes the entropy of the system without affecting the energy levels. Therefore,

$$\delta Q = \sum_{i=0}^{\infty} \varepsilon_i\,dN_i \quad \text{and} \quad \delta W = -\sum_{i=0}^{\infty} N_i\,d\varepsilon_i \tag{3.19b}$$

In writing the above equation, $\delta Q$ is positive for heat transferred to the system, and $\delta W$ is positive for work done by the system. They are related to macroscopic quantities for simple systems by $\delta Q = T\,dS$ and $\delta W = P\,dV$. Hence, we obtain the expression of the first law for a closed system, $dU = \delta Q - \delta W$. If the system is an open system, then $\sum_{i=0}^{\infty} \varepsilon_i\,dN_i = dU + \delta W \ne \delta Q$.

3.2.2 Entropy

The macroscopic property entropy is related to the thermodynamic probability by

$$S = k_B \ln\Omega \tag{3.20}$$

where $k_B$ is the Boltzmann constant. Consider two separate systems A and B, and their combination as a system C. At a certain time, both A and B are individually in thermodynamic equilibrium. Denote the states as A1 and B1, and the combined system as state C1. The thermodynamic probability of system C at state C1 is related to those of A1 and B1 by

$$\Omega_{C1} = \Omega_{A1} \times \Omega_{B1}$$

The entropy of C1 is then

$$S_{C1} = k_B \ln\Omega_{C1} = k_B \ln(\Omega_{A1}\,\Omega_{B1}) = k_B \ln\Omega_{A1} + k_B \ln\Omega_{B1} = S_{A1} + S_{B1}$$

Therefore, this definition of entropy meets the additive requirement. The highest entropy principle states that the entropy of an isolated system will increase until it reaches a stable-equilibrium state (thermodynamic equilibrium), i.e., $\Delta S_{\mathrm{isolated}} \ge 0$. The microscopic understanding is that entropy is related to the probability of occurrence of a certain macrostate. For a system with specified $U$, $N$, and $V$, the macrostate that corresponds to the thermodynamic equilibrium is the most probable state and, hence, its entropy is the largest. Any states, including those that deviate very slightly from the stable-equilibrium state, will have a much smaller thermodynamic probability. After the equilibrium state is reached, it is not possible for any macrostate, whose thermodynamic probability is much less than that of the equilibrium state, to occur within an observable amount of time.
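How sharply the thermodynamic probability peaks at the most probable macrostate can be illustrated with a toy model. The sketch below assumes $N$ distinguishable particles shared between two states of equal energy, so that the energy constraint places no restriction on the split and $\Omega = N!/(N_1!\,N_2!)$. The model, the particle number, and the function name are assumptions made for illustration. Logarithms of factorials are evaluated with the log-gamma function to avoid overflow.

```python
from math import lgamma


def ln_omega_two_states(n_total, n_first):
    """ln of N!/(N1! N2!) with N2 = N - N1, using ln(n!) = lgamma(n + 1)."""
    n_second = n_total - n_first
    return lgamma(n_total + 1) - lgamma(n_first + 1) - lgamma(n_second + 1)


N = 1_000_000
ln_peak = ln_omega_two_states(N, N // 2)          # even split, most probable
ln_off = ln_omega_two_states(N, N // 2 + 5_000)   # 1% more particles in one state
ln_ratio = ln_off - ln_peak                       # ln(Omega_off / Omega_peak)

print("ln(Omega_off / Omega_peak) =", ln_ratio)
print("entropy difference in units of k_B:", ln_ratio)
```

Even with only a million particles, a 1% imbalance lowers $\ln\Omega$ by roughly 50, so that macrostate is less probable than the even split by a factor of about $e^{-50}$. For a macroscopic particle number the suppression is far stronger, which is why observable deviations from equilibrium do not occur.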

3.2.3 The Lagrangian Multipliers

For all three types of statistics, $d(\ln\Omega) = \alpha \sum_{i=0}^{\infty} dN_i + \beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i$, where the first term is the change in the total number of particles and the second can be related to the net heat transfer for a closed system; therefore, $d(\ln\Omega) = \alpha\,dN + \beta\,\delta Q$. In a reversible process in which the total number of particles does not change (closed system), $dN = 0$, $d(\ln\Omega) = dS/k_B$, and $\delta Q = T\,dS$. Hence, we have for all three statistics

$$\beta \equiv \frac{1}{k_B T} \tag{3.21}$$

To evaluate $\alpha$, we must allow the system to change its composition. In this case,

$$d(\ln\Omega) = \alpha \sum_{i=0}^{\infty} dN_i + \beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i = \alpha\,dN + \beta(dU + P\,dV)$$

or

$$T\,dS = k_B T \alpha\,dN + dU + P\,dV$$

Substituting the above equation into the definition of the Helmholtz function, $dA = d(U - TS) = dU - T\,dS - S\,dT$, we have

$$dA = -S\,dT - P\,dV - k_B T \alpha\,dN$$

Noting that the chemical potential $\mu = (\partial A/\partial N)_{T,V} = -k_B T \alpha$, we obtain

$$\alpha = -\frac{\mu}{k_B T} \tag{3.22}$$

where $\mu$ is expressed in molecular quantity, and $\alpha = -\mu/RT$ if $\mu$ is expressed in molar quantity.

3.2.4 Entropy at Absolute Zero Temperature

The third law of thermodynamics states that the entropy of any pure substance vanishes at the ground state (with absolute zero temperature); see Sec. 2.1.3. For BE statistics, we have

$$N = N_0 + N_1 + N_2 + \cdots = \frac{g_0}{e^{\alpha + \beta\varepsilon_0} - 1} + \frac{g_1}{e^{\alpha + \beta\varepsilon_1} - 1} + \frac{g_2}{e^{\alpha + \beta\varepsilon_2} - 1} + \cdots$$

At very low temperatures ($T \to 0$), $\beta = 1/k_B T \to \infty$. Since $\varepsilon_0 < \varepsilon_1 < \varepsilon_2 < \cdots$,

$$\frac{N_i}{N_0} \approx \frac{g_i}{g_0} e^{-\beta(\varepsilon_i - \varepsilon_0)} \to 0 \quad \text{as } T \to 0 \text{ for } i \ge 1 \tag{3.23}$$

Hence, $N \approx N_0$; that is, all particles will be at the lowest energy level (ground state). If $g_0 = 1$, as is the case for a pure substance, then $\Omega = 1$ and $S = k_B \ln\Omega = 0$ as $T \to 0$; this is consistent with the third law of thermodynamics. The tendency of particles that obey BE statistics (bosons) to collapse to the ground state at sufficiently low temperatures is called the Bose-Einstein condensation. Such a state of matter is called the Bose-Einstein condensate, in which quantum effects dominate the macroscopic behavior. Some important applications of the Bose-Einstein condensation are superfluidity and superconductivity. Liquid helium ($^4$He) becomes a superfluid with no viscosity at temperatures below the $\lambda$-transition ($T \approx 2.17$ K). The specific heat of helium at this temperature becomes infinitely large, suggesting that a phase transition occurs. Bose-Einstein condensate of atoms has been observed with laser cooling and trapping techniques.9 Photons from the laser collide with the atoms. The absorption can be tuned using the Doppler shift so that only atoms traveling toward the laser can absorb the photons, resulting in reduced momenta of these atoms. Furthermore, the excited atoms will emit photons spontaneously in all directions. The net effect is a decrease in the velocity of the atoms, resulting in a kinetic temperature down to the nanokelvin range. In the last decade, the Nobel Prize in Physics was awarded for works related to the Bose-Einstein condensation four times: 1996, 1997, 2001, and 2003. Although electrons are fermions (particles that obey FD statistics) that generally do not condense at zero temperature, they can form pairs at sufficiently low temperatures that behave like bosons. Below the critical temperature, pairs of electrons, called Cooper pairs, can travel freely without any resistance. This is the phenomenon called superconductivity, which was discovered at the beginning of the twentieth century. A large number of elements and compounds can be made superconducting at very low temperatures. Furthermore, some oxides become superconducting at temperatures above 90 K.10 Superconductors have important applications in magnetic resonance imaging, high-speed and low-noise electronic devices, infrared sensors, and so forth. A similar phenomenon is the superfluidity in the helium isotope $^3$He, which undergoes a phase transition at very low temperatures. The fermionic $^3$He atoms pair up to form bosonic entities that experience Bose-Einstein condensation at 3 mK. For FD statistics, from Eq. (3.17), Eq. (3.21), and Eq. (3.22), we have

$$\frac{N_i}{g_i} = \frac{1}{e^{(\varepsilon_i - \mu)/k_B T} + 1} \tag{3.24}$$

As $T \to 0$, it is found that $N_i/g_i = 1$ for all energy levels with $\varepsilon_i < \mu$ and $N_i/g_i = 0$ for energy levels with $\varepsilon_i > \mu$. That is, all quantum states are filled for $i = 0, 1, 2, \ldots, j$ (with $\varepsilon_j < \mu$), and all quantum states are empty for $i = j + 1, j + 2, \ldots$ (with $\varepsilon_{j+1} > \mu$), as schematically shown in Fig. 3.6. More discussions will be given in Chap. 5 on the behavior of free electrons. For now, it is sufficient to say that the thermodynamic probability $\Omega = 1$ for FD statistics at absolute zero temperature. Therefore, the entropy $S = 0$ at $T \to 0$ K for both the BE and FD statistics. However, MB statistics does not satisfy the third law and is not applicable to very low temperatures.

FIGURE 3.6 Schematic of the Fermi-Dirac distribution at 0 K.

3.2.5 Macroscopic Properties in Terms of the Partition Function

The partition function is an important quantity in statistical thermodynamics. Unlike the characteristic functions (such as the Helmholtz free energy and the Gibbs free energy defined in Chap. 2) used in macroscopic thermodynamics, the physical meaning of the partition function is not immediately clear. However, the introduction of the partition function allows the calculation of macroscopic thermodynamic properties from the microscopic representation. There are different types of partition functions. For MB statistics, the partition function is defined as

$$Z = N e^{\alpha} = \sum_{i=0}^{\infty} g_i e^{-\varepsilon_i/k_B T} \tag{3.25}$$

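Therefore,

$$N_i = \frac{N}{Z}\,g_i e^{-\varepsilon_i/k_B T} \tag{3.26}$$

Since

$$\left[ \frac{\partial(\ln Z)}{\partial T} \right]_{V,N} = \frac{1}{Z} \left( \frac{\partial Z}{\partial T} \right)_{V,N} = \frac{1}{N e^{\alpha}} \sum_{i=0}^{\infty} g_i e^{-\varepsilon_i/k_B T}\,\frac{\varepsilon_i}{k_B T^2} = \frac{U e^{\alpha}}{k_B T^2} \cdot \frac{1}{N e^{\alpha}} = \frac{U}{N k_B T^2}$$

we have

$$U = N k_B T^2 \left[ \frac{\partial(\ln Z)}{\partial T} \right]_{V,N} \tag{3.27}$$

Using the corrected MB statistics given in Eq. (3.18), we can express the entropy as

$$S = k_B \ln(\Omega_{\mathrm{MB}}/N!) = k_B \sum_{i=0}^{\infty} N_i \left( 1 + \ln\frac{g_i}{N_i} \right) = k_B \sum_{i=0}^{\infty} N_i \left( 1 + \ln\frac{Z}{N} + \beta\varepsilon_i \right) = N k_B + N k_B \ln\frac{Z}{N} + k_B \beta U \tag{3.28a}$$

Had we not divided $\Omega_{\mathrm{MB}}$ by $N!$, we would get $S = N k_B \ln Z + k_B \beta U$, which is different from Eq. (3.28a) by a constant. After substituting $\beta$ and $U$ into Eq. (3.28a), we obtain

$$S = N k_B \left\{ 1 + \ln\frac{Z}{N} + T \left[ \frac{\partial(\ln Z)}{\partial T} \right]_{V,N} \right\} \tag{3.28b}$$

The Helmholtz free energy is

$$A = U - TS = -N k_B T \left( 1 + \ln\frac{Z}{N} \right) \tag{3.29}$$

The pressure is

$$P = -\left( \frac{\partial A}{\partial V} \right)_{T,N} = N k_B T \left[ \frac{\partial(\ln Z)}{\partial V} \right]_{T,N} \tag{3.30}$$

The enthalpy $H$ and the Gibbs free energy $G$ can also be obtained. The partition function is now related to the macroscopic thermodynamic properties of interest for simple substances.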
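The link between the partition function and the internal energy, Eq. (3.27), can be verified numerically on a small made-up system. The sketch below uses three energy levels with arbitrary energies and degeneracies in reduced units ($k_B = 1$). It computes $U$ once by summing $\varepsilon_i N_i$ with $N_i$ from Eq. (3.26), and once from a finite-difference derivative of $\ln Z$. The level data, the step size, and the function names are assumptions chosen for illustration.

```python
import math

K_B = 1.0  # Boltzmann constant in reduced units (assumption for this sketch)

# Hypothetical energy levels as (energy, degeneracy) pairs.
LEVELS = [(0.0, 1), (1.0, 3), (2.5, 5)]


def partition_function(levels, temperature):
    """Z = sum of g_i exp(-eps_i / k_B T), Eq. (3.25)."""
    return sum(g * math.exp(-eps / (K_B * temperature)) for eps, g in levels)


def internal_energy_direct(levels, n_particles, temperature):
    """U = sum of eps_i N_i with N_i from Eq. (3.26)."""
    z = partition_function(levels, temperature)
    return sum(
        eps * (n_particles / z) * g * math.exp(-eps / (K_B * temperature))
        for eps, g in levels
    )


def internal_energy_from_ln_z(levels, n_particles, temperature, dt=1e-5):
    """U = N k_B T^2 d(ln Z)/dT, Eq. (3.27), with a central finite difference."""
    ln_z_plus = math.log(partition_function(levels, temperature + dt))
    ln_z_minus = math.log(partition_function(levels, temperature - dt))
    dlnz_dt = (ln_z_plus - ln_z_minus) / (2.0 * dt)
    return n_particles * K_B * temperature ** 2 * dlnz_dt


N_PARTICLES = 1000
T = 1.5
u_direct = internal_energy_direct(LEVELS, N_PARTICLES, T)
u_from_z = internal_energy_from_ln_z(LEVELS, N_PARTICLES, T)
print("U from level populations:", u_direct)
print("U from d(ln Z)/dT:       ", u_from_z)
```

The two values should agree to within the finite-difference error, which reflects the identity derived above rather than any property of the particular levels chosen.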

3.3 IDEAL MOLECULAR GASES

An important application of statistical mechanics is to model and predict the thermal properties of materials. In this section, the application of MB statistics to obtain the equation of state and the velocity distributions for ideal molecular gases is presented. The microscopic theories of the specific heat for ideal monatomic and polyatomic gases are given subsequently.

3.3.1 Monatomic Ideal Gases

For a monatomic ideal gas at moderate temperatures, MB statistics can be applied, and the translational energies are

$$\varepsilon = \frac{1}{2} m (v_x^2 + v_y^2 + v_z^2) = \frac{1}{2} m v^2 \tag{3.31}$$

Consider a volume element in the phase space, $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$, where $\mathbf{p} = m\mathbf{v}$ is the momentum of a molecule. The accuracy of specifying the momentum and the displacement is limited by $\Delta x\,\Delta p_x \sim h$, given by the uncertainty principle. The degeneracy, which is the number of quantum states (boxes of size $h^3$) in a volume element of the phase space, is given by

$$dg = \frac{dx\,dy\,dz\,dp_x\,dp_y\,dp_z}{h^3} = \frac{m^3}{h^3}\,dx\,dy\,dz\,dv_x\,dv_y\,dv_z \tag{3.32}$$

Many useful results were obtained before quantum mechanics by assuming that $h^3$ is some constant. A more rigorous proof of Eq. (3.32) will be given in Sec. 3.5. When the spacing between energy levels is sufficiently small, the partition function can be expressed in terms of an integral as $Z_t = \int e^{-\varepsilon/k_B T}\,dg$ or

$$Z_t = \iiint dx\,dy\,dz \iiint \frac{m^3}{h^3} \exp\left[ -\frac{m}{2 k_B T} (v_x^2 + v_y^2 + v_z^2) \right] dv_x\,dv_y\,dv_z \tag{3.33}$$

The space integration yields the volume $V$, and the velocity integration can be individually performed, i.e.,

$$\int_{-\infty}^{\infty} \exp\left( -\frac{m v_x^2}{2 k_B T} \right) dv_x = \sqrt{\frac{2\pi k_B T}{m}} \tag{3.34}$$

Hence,

$$Z_t = V \left( \frac{2\pi m k_B T}{h^2} \right)^{3/2} \tag{3.35}$$

Therefore,

$$e^{\alpha} = \frac{V}{N} \left( \frac{2\pi m k_B T}{h^2} \right)^{3/2} \tag{3.36}$$

which is indeed much greater than unity at normal temperatures for most substances, suggesting that the MB statistics is applicable for ideal molecular gases. At extremely low temperatures, intermolecular forces cannot be neglected and the molecules are not independent anymore. From Eq. (3.30), we have $P = N k_B T\,[\partial(\ln Z)/\partial V]_{T,N} = N k_B T/V$; i.e.,

$$PV = N k_B T \quad \text{or} \quad P = n k_B T \tag{3.37}$$

where $n = N/V$ is the number density. The Boltzmann constant is the ideal (universal) gas constant on the molecular basis, i.e., $k_B = R/N_A$. The internal energy, the specific heats, and the absolute entropy can also be evaluated.

$$U = N k_B T^2 \left[ \frac{\partial(\ln Z)}{\partial T} \right]_{V,N} = \frac{3}{2} N k_B T \tag{3.38}$$

which is not a function of pressure. The molar specific internal energy is $u = \frac{3}{2} RT$, and the molar specific heats are

$$c_v = \left( \frac{\partial u}{\partial T} \right)_V = \frac{3}{2} R \tag{3.39}$$

and

$$c_p = \left( \frac{\partial h}{\partial T} \right)_P = \frac{5}{2} R \tag{3.40}$$

The above equations show that the specific heats of monatomic gases are independent of temperature, except at very high temperatures when electronic contributions become important. The molar specific heats do not depend on the type of molecules, but the same is not true for mass specific heats. Using Eq. (3.28b), the absolute entropy can be expressed as

$$S = N k_B \left\{ \frac{5}{2} + \ln\left[ \frac{V}{N} \left( \frac{2\pi m k_B T}{h^2} \right)^{3/2} \right] \right\}$$

Therefore, the molar specific entropy is a function of $T$ and $P$, i.e.,

$$s(T, P) = R \left\{ \frac{5}{2} + \ln\left[ \frac{k_B T}{P} \left( \frac{2\pi m k_B T}{h^2} \right)^{3/2} \right] \right\} \tag{3.41}$$

This is the Sackur-Tetrode equation.
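As a quick numerical illustration of Eq. (3.41), the sketch below evaluates the molar entropy of helium treated as a monatomic ideal gas. The choice of helium, the molar mass value, the state points (298.15 K at 1 bar and 2 bar), and the function name are assumptions added for illustration. The physical constants are standard SI values.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck's constant, J s
N_A = 6.02214076e23     # Avogadro's constant, 1/mol
R = K_B * N_A           # universal gas constant, J/(mol K)

M_HELIUM = 4.002602e-3  # molar mass of He, kg/mol (assumed value)


def sackur_tetrode_molar_entropy(molar_mass, temperature, pressure):
    """Molar entropy s(T, P) of a monatomic ideal gas from Eq. (3.41), in J/(mol K)."""
    m = molar_mass / N_A  # mass of one atom, kg
    thermal_factor = (2.0 * math.pi * m * K_B * temperature / H ** 2) ** 1.5
    return R * (2.5 + math.log(K_B * temperature / pressure * thermal_factor))


s_1bar = sackur_tetrode_molar_entropy(M_HELIUM, 298.15, 1.0e5)
s_2bar = sackur_tetrode_molar_entropy(M_HELIUM, 298.15, 2.0e5)
print(f"s(298.15 K, 1 bar) = {s_1bar:.2f} J/(mol K)")
print(f"s(298.15 K, 2 bar) = {s_2bar:.2f} J/(mol K)")
print(f"difference = {s_1bar - s_2bar:.3f} J/(mol K), R ln 2 = {R * math.log(2):.3f}")
```

The result at 1 bar should come out close to 126 J/(mol K). Doubling the pressure at fixed temperature lowers the molar entropy by $R \ln 2$, as expected from the $\ln(1/P)$ dependence in Eq. (3.41).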

3.3.2 Maxwell’s Velocity Distribution Rewrite Ni gi eaeei/kBT as dN dg eaee/kBT . In a volume V and from v to v dv (i.e., vx to vx dvx, vy to vy dvy, and vz to vz dvz), the number of molecules dN per unit volume may be expressed as 3/2 N dN m 2 m3 h2 3 dvxdvydvz a b expa vb V V 2pmkBT 2kBT h

(3.42)

or f(v)dv

3/2 dN mv2 m na b expa bdv V 2pkBT 2kBT

(3.43)

where f(v) is the Maxwell velocity distribution in a unit volume. Notice that 3/2 f(v) mv2 m b expa b F(v) n a 2pkBT 2kBT

(3.44)

which is a Gaussian distribution. Notice that v2 v v v2 v2x v2y v2z. The distribution of velocity component is also Gaussian, such that F(v) F(vx)F(vy)F(vz)

(3.45)

Taking the x component as an example, we can write F(vx) a

1/2 mv2x m b expa b 2pkBT 2kBT

(3.46)

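The speed distribution may be obtained by integrating the velocity distribution over a spherical shell (i.e., over the solid angle of $4\pi$):

$$F(v)\,dv = \iint_{4\pi} F(\mathbf{v})\,d\mathbf{v} = \iint_{4\pi} \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right) v^2\, d\Omega\, dv$$

Therefore,

$$F(v) = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} v^2 \exp\left(-\frac{m v^2}{2 k_B T}\right) \tag{3.47}$$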
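A simple consistency check on Eq. (3.47) is that the speed distribution integrates to unity at any temperature. The sketch below verifies this numerically for helium; the trapezoidal integrator, the upper speed limit of 20 km/s, and all function names are choices made for this illustration.

```python
import math

K_B = 1.380649e-23                      # Boltzmann constant, J/K
M_HE = 4.002602e-3 / 6.02214076e23      # mass of one helium atom, kg


def speed_distribution(v, m, temperature):
    """Maxwell speed distribution F(v) of Eq. (3.47), in s/m."""
    a = m / (2.0 * K_B * temperature)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * v * v * math.exp(-a * v * v)


def trapezoid(func, lower, upper, steps):
    """Composite trapezoidal rule on a uniform grid."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h


def normalization(temperature, v_max=20000.0, steps=20000):
    """Integral of F(v) from 0 to v_max (v_max is assumed large enough)."""
    return trapezoid(lambda v: speed_distribution(v, M_HE, temperature),
                     0.0, v_max, steps)


# Same temperatures as in Fig. 3.7, converted from deg C to K.
norms = {t_c: normalization(t_c + 273.15) for t_c in (0.0, 300.0, 800.0)}
for t_c, value in norms.items():
    print(f"T = {t_c:5.0f} C: integral of F(v) dv = {value:.8f}")
```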

Figure 3.7 plots the speed distribution of He gas at 0, 300, and 800°C. When evaluating $k_B T$, we must convert T to absolute temperature. It can be seen that more molecules will be at higher speeds as the temperature increases. It should be noted that $F(v = 0) = 0$ but $F(\mathbf{v})$ is maximum at $\mathbf{v} = 0$. In the speed coordinate, an interval between $v$ and $v + dv$ corresponds to a spherical shell in the velocity space. Even though $F(\mathbf{v})$ is maximum at $\mathbf{v} = 0$, the probability of finding a molecule per unit speed interval decreases to 0 as $v \to 0$, which is caused by the associated decrease in the volume of the spherical shell.

FIGURE 3.7 Speed distribution for helium gas at different temperatures.

Example 3-5. Find the average speed and the root-mean-square speed for a He gas at 200°C at 100 kPa. What if the pressure is changed to 200 kPa? What are the most probable velocity and the most probable speed?

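Solution. The average speed may be obtained from either the velocity distribution or the speed distribution. That is,

$$\bar{v} = \iiint v F(\mathbf{v})\,d\mathbf{v} = \int_0^\infty v F(v)\,dv = \sqrt{\frac{8 k_B T}{\pi m}} \tag{3.48}$$

The average of $v^2$ is (see Appendix B.5)

$$\overline{v^2} = \iiint v^2 F(\mathbf{v})\,d\mathbf{v} = \int_0^\infty v^2 F(v)\,dv = \frac{3 k_B T}{m} \tag{3.49a}$$

Therefore, the root-mean-square speed is

$$v_{rms} = \sqrt{\overline{v^2}} = \sqrt{\frac{3 k_B T}{m}} \tag{3.49b}$$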

Plugging in the numerical values, we have $\bar{v} = 1582$ m/s and $v_{rms} = 1717$ m/s for He gas at 200°C. We also notice that the pressure has no effect on the speed distribution, unless it is so high that intermolecular forces cannot be neglected. The most probable velocity is $\mathbf{v}_{mp} = 0$ because of the symmetry in the Gaussian distribution. We can obtain the most probable speed by setting $F'(v) = 0$, i.e.,

$$2v \exp\left(-\frac{m v^2}{2 k_B T}\right) + v^2 \left(-\frac{m v}{k_B T}\right) \exp\left(-\frac{m v^2}{2 k_B T}\right) = 0$$


The solution gives the most probable speed as $v_{mp} = \sqrt{2 k_B T/m}$. For He gas at 200°C, it gives $v_{mp} = 1402$ m/s. Note that $v_{mp} : \bar{v} : v_{rms} = \sqrt{2} : \sqrt{8/\pi} : \sqrt{3} \approx 1.4 : 1.6 : 1.7$.
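The numbers quoted in Example 3-5 can be reproduced with a few lines of code. This is only a sketch: the helium molar mass of 4.002602 g/mol and the function name are assumptions introduced here, not values given in the example.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol


def characteristic_speeds(molar_mass_kg, temperature):
    """Return (v_mp, v_avg, v_rms) in m/s from Eqs. (3.48) and (3.49b)."""
    m = molar_mass_kg / N_A
    kt_over_m = K_B * temperature / m
    v_mp = math.sqrt(2.0 * kt_over_m)                # most probable speed
    v_avg = math.sqrt(8.0 * kt_over_m / math.pi)     # average speed
    v_rms = math.sqrt(3.0 * kt_over_m)               # root-mean-square speed
    return v_mp, v_avg, v_rms


# Helium at 200 deg C; pressure does not enter the ideal-gas result.
v_mp, v_avg, v_rms = characteristic_speeds(4.002602e-3, 200.0 + 273.15)
print(f"v_mp  = {v_mp:.0f} m/s")
print(f"v_avg = {v_avg:.0f} m/s")
print(f"v_rms = {v_rms:.0f} m/s")
print(f"ratio v_mp : v_avg : v_rms = 1 : {v_avg / v_mp:.3f} : {v_rms / v_mp:.3f}")
```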

Comment. An important consequence of Eq. (3.49a) is that temperature is related to the mean kinetic energy of the molecule, i.e.,

$$\frac{1}{2} m \overline{v_x^2} = \frac{1}{2} m \overline{v_y^2} = \frac{1}{2} m \overline{v_z^2} = \frac{1}{2} k_B T \tag{3.50}$$

The internal energy of a monatomic gas given in Eq. (3.38) is the sum of the kinetic energies of all molecules.

3.3.3 Diatomic and Polyatomic Ideal Gases

Additional degrees of freedom or energy storage modes, besides translation, must be considered for diatomic and polyatomic molecules. The molecule may rotate about its center of gravity, and atoms may vibrate with respect to each other. For a molecule consisting of q atoms, each atom may move in all three directions, and there will be a total of 3q modes. Consider the translation of the molecule as a whole; there are three translational degrees of freedom or modes, i.e., $f_t = 3$. For diatomic molecules or polyatomic molecules whose atoms are arranged in a line (such as CO2), as shown in Fig. 3.8, there are two rotational degrees of freedom or modes, i.e., $f_r = 2$. Therefore, there are $f_v = 3q - 5$ vibrational modes, each consisting of two degrees of freedom corresponding to the kinetic energy and the potential energy. For polyatomic molecules whose atoms are not aligned (such as H2O and CH4, see Fig. 3.9), there are three rotational degrees of freedom, i.e., $f_r = 3$. The vibrational modes are thus $f_v = 3q - 6$.

FIGURE 3.8 (a) A diatomic molecule, showing two rotational and one vibrational degrees of freedom. (b) CO2 molecule, where the atoms are aligned.

The total energy of a molecule may be expressed as the sum of translational, rotational, and vibrational energies, i.e., $\varepsilon = \varepsilon_t + \varepsilon_r + \varepsilon_v$. For simplicity, we have neglected contributions from the electronic ground state and chemical dissociation, which can be included as additional terms in evaluating the internal energy and the entropy.¹ At high temperatures, the vibration mode can be coupled with the rotation mode. Here, however, it is assumed that these modes are independent. The partition function can be written as

$$Z = Z_t Z_r Z_v = \left(\sum g_t e^{-\varepsilon_t/k_B T}\right)\left(\sum g_r e^{-\varepsilon_r/k_B T}\right)\left(\sum g_v e^{-\varepsilon_v/k_B T}\right) \tag{3.51}$$


FIGURE 3.9 (a) A H2O molecule in which the atoms are not aligned. (b) The tetrahedral methane (CH4) molecule.

For polyatomic molecules, Eq. (3.31) through Eq. (3.36) hold for the translational modes. $Z_r$ and $Z_v$ are internal contributions that do not depend on volume; therefore, Eq. (3.37) also holds. Since the degrees of freedom are independent of each other, Maxwell’s velocity and speed distributions discussed in Sec. 3.3.2 still hold for polyatomic gases. The problem now is to determine the rotational and vibrational energy levels and degeneracies. Generally speaking, there exists a certain characteristic temperature associated with each degree of freedom. The characteristic temperature for translation is very low for molecular gases. On the other hand, the characteristic temperature for rotation is slightly higher, and that for vibration is usually very high, as can be seen from Table 3.2 for selected diatomic molecules. If the temperature is much less than the characteristic temperature of a certain mode, then the contribution of that mode to the energy storage is negligible. For temperatures much higher than the characteristic temperature, however, there often exist some asymptotic approximations.

Rotation. A quantum mechanical analysis of a rigid rotor, to be derived in Sec. 3.5.3, shows that the rotational energy levels are given by

$$\frac{\varepsilon_l}{k_B T} = l(l+1)\frac{\Theta_r}{T} \tag{3.52}$$

TABLE 3.2 Characteristic Temperatures of Rotation and Vibration for Some Diatomic Molecules

Substance            Symbol    Θr (K)    Θv (K)
Hydrogen             H2        87.5      6320
Deuterium            D2        43.8      4490
Hydrogen chloride    HCl       15.2      4330
Nitrogen             N2        2.86      3390
Carbon monoxide      CO        2.78      3120
Nitric oxide         NO        2.45      2745
Oxygen               O2        2.08      2278
Chlorine             Cl2       0.35      814
Sodium vapor         Na2       0.08      140


Here, $\Theta_r$ is the characteristic temperature for rotation and is given by $\Theta_r = h^2/(8\pi^2 k_B I)$, where I is the moment of inertia of the molecule about the center of mass. The larger the value of I, the smaller the characteristic temperature will be. This is clearly shown in Table 3.2. The degeneracy of rotational energy levels is

$$g_l = \frac{2l+1}{\sigma} \tag{3.53}$$

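where $\sigma$ is a symmetry number that arises from molecular symmetry: $\sigma = 1$ if the atoms are of different types (such as in a NO or CO molecule), and $\sigma = 2$ if the atoms are the same (such as in an O2 or N2 molecule). The rotational partition function is therefore

$$Z_r = \sum_{l=0}^{\infty} \frac{2l+1}{\sigma} \exp\left[-l(l+1)\frac{\Theta_r}{T}\right] \tag{3.54}$$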

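This series converges very fast for $\Theta_r/T > 0.5$, since

$$Z_r = \frac{1}{\sigma}\left[1 + 3\exp\left(-\frac{2\Theta_r}{T}\right) + 5\exp\left(-\frac{6\Theta_r}{T}\right) + 7\exp\left(-\frac{12\Theta_r}{T}\right) + \cdots\right]$$

For $T/\Theta_r > 1$, Eq. (3.54) may be expanded to give (see Problem 3.26)

$$Z_r = \frac{T}{\sigma\Theta_r}\left[1 + \frac{1}{3}\left(\frac{\Theta_r}{T}\right) + \frac{1}{15}\left(\frac{\Theta_r}{T}\right)^2 + \frac{4}{315}\left(\frac{\Theta_r}{T}\right)^3 + \cdots\right] \tag{3.55}$$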

At temperatures much higher than the characteristic temperature of rotation, $T/\Theta_r \gg 1$, the above equation reduces to

$$Z_r = \frac{T}{\sigma\Theta_r} \tag{3.56}$$

Under this limit, the contribution of the rotational energy to the internal energy becomes

$$U_r \approx N k_B T \tag{3.57}$$

The contribution to the molar specific heat by the two rotational degrees of freedom is

$$c_{v,r} = R \tag{3.58}$$
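To see how quickly the approximations in Eqs. (3.55) and (3.56) approach the exact sum in Eq. (3.54), one can compare the three expressions numerically. The sketch below does this for nitrogen at 300 K using the Table 3.2 value of 2.86 K and a symmetry number of 2; the truncation limit `l_max` and the function names are assumptions of this example.

```python
import math


def z_rot_sum(theta_r, temperature, sigma=1, l_max=2000):
    """Direct summation of Eq. (3.54), truncated at l_max."""
    x = theta_r / temperature
    total = 0.0
    for l in range(l_max + 1):
        total += (2 * l + 1) * math.exp(-l * (l + 1) * x)
    return total / sigma


def z_rot_series(theta_r, temperature, sigma=1):
    """Expansion of Eq. (3.55), intended for T/theta_r > 1."""
    x = theta_r / temperature
    return (1.0 / (sigma * x)) * (1.0 + x / 3.0 + x**2 / 15.0 + 4.0 * x**3 / 315.0)


def z_rot_high_t(theta_r, temperature, sigma=1):
    """High-temperature limit of Eq. (3.56)."""
    return temperature / (sigma * theta_r)


THETA_R_N2 = 2.86   # K, from Table 3.2
SIGMA_N2 = 2        # identical atoms

z_exact = z_rot_sum(THETA_R_N2, 300.0, SIGMA_N2)
z_series = z_rot_series(THETA_R_N2, 300.0, SIGMA_N2)
z_limit = z_rot_high_t(THETA_R_N2, 300.0, SIGMA_N2)

print(f"direct sum, Eq. (3.54): {z_exact:.4f}")
print(f"expansion,  Eq. (3.55): {z_series:.4f}")
print(f"high-T,     Eq. (3.56): {z_limit:.4f}")
```

At room temperature the expansion is essentially indistinguishable from the direct sum, and even the one-term limit is within about half a percent.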

Vibration. The vibration in a molecule can be treated as a harmonic oscillator. For each vibration mode, the quantized energy levels are given in Sec. 3.5.5 as

$$\varepsilon_{v,i} = \left(i + \frac{1}{2}\right) h\nu, \quad i = 0, 1, 2, \ldots \tag{3.59}$$

where $\nu$ is the natural frequency of vibration, and the ground-state energy is $\frac{1}{2}h\nu$. The vibrational energy levels are not degenerate, i.e., $g_{v,i} = 1$. Therefore, we can write

$$Z_v = \sum_{i=0}^{\infty} e^{-(i+1/2)h\nu/k_B T} = e^{-\Theta_v/2T} \sum_{i=0}^{\infty} e^{-i\Theta_v/T}$$


where $\Theta_v = h\nu/k_B$ is a characteristic temperature for vibration and is listed in Table 3.2 for several diatomic molecules. The vibrational partition function becomes

$$Z_v = \frac{e^{-\Theta_v/2T}}{1 - e^{-\Theta_v/T}} = \frac{e^{\Theta_v/2T}}{e^{\Theta_v/T} - 1} \tag{3.60}$$

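Its contribution to the internal energy and the specific heat can be written as

$$U_v = N k_B \Theta_v \left(\frac{1}{2} + \frac{1}{e^{\Theta_v/T} - 1}\right) \tag{3.61}$$

and

$$c_{v,v} = R\,\frac{\Theta_v^2}{T^2}\,\frac{e^{\Theta_v/T}}{(e^{\Theta_v/T} - 1)^2} \tag{3.62}$$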
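Equations (3.61) and (3.62) are easy to evaluate numerically. The sketch below computes the vibrational specific heat of nitrogen (3390 K from Table 3.2) at 300 K and at 1.5 times the characteristic temperature, and confirms by finite differences that Eq. (3.62) is the temperature derivative of Eq. (3.61). Function names and the finite-difference step are choices made for this illustration.

```python
import math


def u_vib_over_nkb(theta_v, temperature):
    """U_v / (N k_B) from Eq. (3.61), in kelvin."""
    return theta_v * (0.5 + 1.0 / math.expm1(theta_v / temperature))


def cv_vib_over_r(theta_v, temperature):
    """c_{v,v} / R from Eq. (3.62), written with exp(-x) to avoid overflow."""
    x = theta_v / temperature
    return x * x * math.exp(-x) / math.expm1(-x) ** 2


THETA_V_N2 = 3390.0   # K, from Table 3.2

cv_room = cv_vib_over_r(THETA_V_N2, 300.0)
cv_high = cv_vib_over_r(THETA_V_N2, 1.5 * THETA_V_N2)

# Central finite difference of U_v/(N k_B) with respect to T at 1.5*theta_v.
t_ref, dt = 1.5 * THETA_V_N2, 1.0
cv_fd = (u_vib_over_nkb(THETA_V_N2, t_ref + dt)
         - u_vib_over_nkb(THETA_V_N2, t_ref - dt)) / (2.0 * dt)

print(f"c_vv/R at 300 K        = {cv_room:.5f}")
print(f"c_vv/R at 1.5 theta_v  = {cv_high:.4f}")
print(f"finite-difference check = {cv_fd:.4f}")
```

The vibrational mode of nitrogen is almost completely frozen at room temperature, while at 1.5 times the characteristic temperature it already contributes about 96% of R.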

At $T \ll \Theta_v$, the vibrational mode contributes to the internal energy but not to the specific heat. At $T > 1.5\Theta_v$, $U_v$ depends almost linearly on T and $c_{v,v} \approx R$. In classical statistical mechanics, it is believed that each degree of freedom contributes to the stored thermal energy with an amount of $\frac{1}{2}k_B T$ and results in a specific heat of $\frac{1}{2}k_B$ on a per-particle basis. This is called the equipartition principle. The contribution of each vibrational mode is R, not R/2, because each vibrational mode includes a kinetic component and a potential component for energy storage and is generally considered as two degrees of freedom. It should be noted that the equipartition principle is only applicable at sufficiently high temperatures and for particles that obey MB statistics or, in some limiting cases, BE statistics. Because energy is additive, as is the specific heat, we can write

$$c_v = c_{v,t} + c_{v,r} + c_{v,v} \tag{3.63}$$

The result is schematically shown in Fig. 3.10. One can see that for a diatomic ideal gas, $c_v = 2.5R$ if

$$\Theta_r \ll T \ll \Theta_v \tag{3.64}$$

which happens to be near room temperature for many gases such as nitrogen and carbon monoxide; see Table 3.2. Figure 3.11 plots the specific heat for several real gases at sufficiently low pressure so that the ideal gas model is applicable. It should be noted that, for hydrogen, nuclear spin is important and Eq. (3.54) needs to be modified to account for the spin degeneracy.¹,² However, Eq. (3.57) and Eq. (3.58) predict the right trend and are applicable at temperatures much higher than $\Theta_r$. At extremely high temperatures (say 3000 K), electronic contributions and the coupling between rotation and vibration become important. Although Eq. (3.63) is the correct expression for the specific heat at moderate temperatures, two additional partition functions must be included to correctly evaluate the internal energy and the entropy (see Problem 3.22). We limit the derivations to the specific heat, which is closely related to heat transfer calculations.

FIGURE 3.10 Typical specific heat curve of a diatomic ideal gas.

FIGURE 3.11 Specific heat at constant volume for several ideal gases.

The characteristic temperature for rotation is usually very small for polyatomic molecules because of their large moments of inertia. Therefore, the rotational degrees of freedom can be assumed as fully excited in almost any practical situation. Each rotational degree of freedom will contribute R/2 to the molar specific heat. For molecules whose atoms are aligned (such as CO2), the rotational contribution to the specific heat is R, and

$$c_v = \frac{5}{2}R + R\sum_{i=1}^{3q-5} \frac{\zeta_i^2 e^{\zeta_i}}{(e^{\zeta_i} - 1)^2}, \quad \zeta_i = \frac{\Theta_{v,i}}{T} \tag{3.65}$$

If $T \gg \Theta_{v,i}$, then $c_v \to R(3q - 2.5)$. For molecules such as H2O and CH4 whose atoms are not aligned, we have

$$c_v = 3R + R\sum_{i=1}^{3q-6} \frac{\zeta_i^2 e^{\zeta_i}}{(e^{\zeta_i} - 1)^2} \tag{3.66}$$

In this case, $c_v \to R(3q - 3)$ at $T \gg \Theta_{v,i}$. Again, electronic contribution may be significant at very high temperatures. Table 3.3 lists the vibrational frequencies for several commonly encountered gases. The unit of frequency is given in inverse centimeter (cm⁻¹), which is often used in spectroscopic analyses. Note that $\Theta_v = h\nu/k_B = hc_0\bar{\nu}/k_B$, where $\bar{\nu}$ is the wavenumber in cm⁻¹ if we take $c_0 = 3 \times 10^{10}$ cm/s. That is, $\Theta_v\ (\mathrm{K}) = 1.44\,\bar{\nu}\ (\mathrm{cm}^{-1})$. One can use this table to estimate the specific heat of these gases based on Eq. (3.65) or Eq. (3.66).

TABLE 3.3 Vibrational Modes of Several Gases, Where the Integer in the Parentheses Indicates the Number of Degenerate Modes

Type    cm⁻¹        cm⁻¹        cm⁻¹    cm⁻¹        Total fv
CO2     667 (2)     1343        2349    –           4
H2O     1595        3657        3756    –           3
CH4     1306 (3)    1534 (2)    2916    3019 (3)    9

In reality, vibration-rotation interactions result in multiple absorption lines around each vibration mode, which can be observed through infrared absorption spectroscopy. Figure 3.12 shows the molecular absorption spectra of CO2 and H2O measured with a Fourier-transform infrared spectrometer. The absorption spectra were obtained by comparing the spectrum when the measurement chamber is open with that when the chamber is purged with nitrogen gas, which does not absorb in the mid-infrared region. The concentrations of H2O and CO2 in the experiments were not controlled since the purpose is to demonstrate the infrared absorption frequencies only. While the resolution of 1 cm⁻¹ is not high enough to resolve very fine features, the absorption bands near 670 cm⁻¹ due to degenerate bending modes and near 2350 cm⁻¹ due to the asymmetric stretching mode in CO2 can be clearly seen. Note that the symmetric vibration mode of CO2 at 1343 cm⁻¹ is infrared inactive, i.e., it does not show up in the absorption spectrum but can be observed with Raman spectroscopy. Furthermore, the vibration-rotation interactions cause multiple lines in the water vapor absorption bands from 1300 to 2000 cm⁻¹ and from 3500 to 4000 cm⁻¹.

FIGURE 3.12 Infrared absorption spectrum of ambient air obtained with a Fourier-transform infrared spectrometer.
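The estimate suggested above, using Table 3.3 together with Eq. (3.65) or Eq. (3.66), can be sketched as follows. The conversion factor 1.4388 cm K (the more precise value behind the 1.44 quoted in the text), the dictionary layout, and the function names are assumptions of this example; rotation is taken as fully excited, as the text assumes.

```python
import math

HC_OVER_KB = 1.4388   # cm*K, so that theta_v [K] = HC_OVER_KB * wavenumber [1/cm]

# Table 3.3: (wavenumber in 1/cm, number of degenerate modes)
VIB_MODES = {
    "CO2": [(667, 2), (1343, 1), (2349, 1)],
    "H2O": [(1595, 1), (3657, 1), (3756, 1)],
    "CH4": [(1306, 3), (1534, 2), (2916, 1), (3019, 3)],
}
# True if the atoms are aligned (two rotational degrees of freedom).
IS_LINEAR = {"CO2": True, "H2O": False, "CH4": False}


def mode_contribution(zeta):
    """zeta^2 e^zeta / (e^zeta - 1)^2, written with exp(-zeta) to avoid overflow."""
    return zeta * zeta * math.exp(-zeta) / math.expm1(-zeta) ** 2


def cv_over_r(gas, temperature):
    """c_v / R from Eq. (3.65) for aligned molecules or Eq. (3.66) otherwise."""
    base = 2.5 if IS_LINEAR[gas] else 3.0
    vib = sum(g * mode_contribution(HC_OVER_KB * nu / temperature)
              for nu, g in VIB_MODES[gas])
    return base + vib


cv_co2_300 = cv_over_r("CO2", 300.0)
cv_ch4_limit = cv_over_r("CH4", 1.0e6)   # unphysically high T, only to show the limit

for gas in VIB_MODES:
    print(f"{gas}: c_v/R at 300 K = {cv_over_r(gas, 300.0):.3f}")
print(f"CH4 high-temperature limit: c_v/R -> {cv_ch4_limit:.3f} (3q - 3 = 12)")
```

For CO2 at 300 K the low-frequency bending modes are already partially excited, so the specific heat is noticeably above 2.5R.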


Example 3-6. How many rotational degrees of freedom are there in a silane (SiH4) molecule? If a low-pressure silane gas is raised to a temperature high enough to completely excite its rotational and vibrational modes, find its specific heats.

Solution. For SiH4, there will be three translational degrees of freedom, i.e., $f_t = 3$, three rotational degrees of freedom, i.e., $f_r = 3$, and $f_v = 3q - 6 = 9$ vibrational modes. If all the modes are excited, the specific heat for constant volume will be $c_v = 1.5R + 1.5R + 9R = 12R$. Given that M = 32 kg/kmol, we find $c_v = 3.12$ kJ/(kg K), $c_p = 3.38$ kJ/(kg K), and $\gamma = 13R/12R = 1.083$. The actual specific heats would be much smaller at moderate temperatures.
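The mode counting in Example 3-6 generalizes to any molecule once the number of atoms and the geometry (aligned or not) are known. A minimal sketch follows, with the function name and argument layout chosen for this illustration and R taken as 8.314 kJ/(kmol K).

```python
R_UNIVERSAL = 8.314   # kJ/(kmol*K)


def fully_excited_specific_heats(num_atoms, linear, molar_mass):
    """Specific heats when all modes are fully excited.

    Returns (f_rot, f_vib, cv, cp, gamma) with cv and cp in kJ/(kg*K);
    molar_mass is in kg/kmol.
    """
    f_rot = 2 if linear else 3
    f_vib = 3 * num_atoms - 3 - f_rot           # number of vibrational modes
    cv_over_r = 1.5 + 0.5 * f_rot + f_vib       # each vibrational mode contributes R
    cp_over_r = cv_over_r + 1.0                 # ideal gas: cp - cv = R
    cv = cv_over_r * R_UNIVERSAL / molar_mass
    cp = cp_over_r * R_UNIVERSAL / molar_mass
    return f_rot, f_vib, cv, cp, cp_over_r / cv_over_r


# Silane: q = 5 atoms, not aligned, M = 32 kg/kmol as in Example 3-6.
f_rot, f_vib, cv, cp, gamma = fully_excited_specific_heats(5, False, 32.0)
print(f"f_r = {f_rot}, f_v = {f_vib}")
print(f"cv = {cv:.2f} kJ/(kg K), cp = {cp:.2f} kJ/(kg K), gamma = {gamma:.3f}")
```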

3.4 STATISTICAL ENSEMBLES AND FLUCTUATIONS

We have finished the discussion about statistical thermodynamics of independent particles without mentioning ensembles. In a system of independent particles, there is no energy associated with particle-particle interactions or the configuration of the particles. For dependent particles or dense fluids, the previous analysis can be extended by using statistical ensembles, an approach pioneered by J. Willard Gibbs (1839–1903) in the late nineteenth century and presented in his 1902 book, Elementary Principles of Statistical Mechanics. A statistical ensemble is a large set of macroscopically similar systems. When the properties are averaged over a properly chosen ensemble, the macroscopic properties can be considered as the same as the time-averaged quantities of the same system. There are three basic types of ensembles: microcanonical ensemble, canonical ensemble, and grand canonical ensemble.¹,⁵

A microcanonical ensemble is composed of a large set of identical systems. Each system in the ensemble is isolated from others by rigid, adiabatic, and impermeable walls. The energy, volume, and number of particles in each system are constant. The results obtained using the microcanonical ensemble for independent particles are essentially the same as what we have obtained in previous sections.

It is natural to ask to what extent the statistical mechanics theory presented in previous sections will be valid for nanosystems. If the equilibrium properties are defined based on a large set of microcanonical ensembles and considered as the time-averaged properties of the system, there will be a sufficiently large number of particles in the whole ensemble to guarantee the basic types of statistics, and the thermodynamic relations derived in Secs. 3.1 and 3.2 are still applicable. On the other hand, the difference between the energy levels due to quantization may be large enough to invalidate the substitution of summation with integration. We will discuss the energy level quantization further in Sec. 3.5. In deriving the properties of ideal gases in Sec. 3.3, the consideration of the translational, rotational, and vibrational degrees of freedom is on the basis of individual molecules. Therefore, the conclusions should be applicable to systems under thermodynamic equilibrium.

In a canonical ensemble, each system is separated from others by rigid and impermeable walls, which are diathermal. All systems have the same volume and number of particles. However, the systems can exchange energy. At equilibrium, the temperature T will be the same for all systems. An important result of applying the canonical ensemble is that the relative energy fluctuation (i.e., the standard deviation of the energy of the system divided by its mean) is proportional to $1/\sqrt{N}$, where N is the total number of independent particles.

In a grand canonical ensemble, each system is separated from others by rigid, diathermal, and permeable walls. While the volume is fixed and is the same for each system, the number of particles as well as the energy of each system can vary. The temperature and the chemical potential must be the same for all systems at equilibrium. This allows the study of density fluctuations for each system. The result for monatomic molecules yields that the relative density fluctuation is also proportional to $1/\sqrt{N}$.
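To give a feel for the $1/\sqrt{N}$ scaling of the energy fluctuation, the sketch below evaluates the relative fluctuation for a monatomic ideal gas. It relies on the standard canonical-ensemble relation $\sigma_E^2 = k_B T^2 C_V$ with $C_V = \frac{3}{2}Nk_B$, which is not derived in this section and is used here as an assumption; combined with $U = \frac{3}{2}Nk_BT$ it gives $\sigma_E/U = \sqrt{2/(3N)}$.

```python
import math


def relative_energy_fluctuation(num_particles):
    """sigma_E / U for a monatomic ideal gas in the canonical ensemble.

    Assumes sigma_E^2 = k_B T^2 C_V with C_V = 1.5 N k_B and U = 1.5 N k_B T,
    so that the ratio depends only on the number of particles.
    """
    return math.sqrt(2.0 / (3.0 * num_particles))


# Illustrative particle numbers: a nanoscale cluster up to about one mole.
fluctuations = {n: relative_energy_fluctuation(n) for n in (1e2, 1e4, 1e6, 6.022e23)}
for n, value in fluctuations.items():
    print(f"N = {n:9.3e}: sigma_E / U = {value:.3e}")
```

For a system of only a hundred particles the fluctuation is several percent of the mean energy, whereas for a macroscopic amount of gas it is entirely negligible.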


The canonical and grand canonical ensembles are essential for the study of complex thermodynamic systems, such as mixtures, chemical equilibria, dense gases, and liquids, which will not be further discussed in this text. Interested readers can find more details from Tien and Lienhard¹ and Carey.⁵ A simple theory based on independent particles of phonons and electrons will be discussed in Chap. 5. While the partition function can also be used to study the thermodynamic relations of solids, the approach used in solid state physics will be adopted in a detailed study of the properties of solids presented in Chap. 7.

3.5 BASIC QUANTUM MECHANICS

So far we have largely avoided the derivations and equations involving quantum mechanics, by using the conclusions from quantum theory on an as-needed basis without proof. In this section, we shall present the basics of quantum mechanics to enhance the understanding of the materials already presented and to provide some background for future chapters. In classical mechanics, the state of a system is completely described by giving the position and the momentum of each particle in the system at any given time. The equation of motion is given in Eq. (3.1), which is also the basis for molecular dynamics. The position and the momentum of each particle are precisely determined using the initial values and the forces exerted on it afterward. According to the wave-particle duality, particles also have wave characteristics. The results are described in quantum mechanics by the Schrödinger wave equation. The solution of the Schrödinger equation is given in the form of a wavefunction, which describes the probabilities of the possible outcomes rather than the exact position and momentum of the particle. Another important aspect in quantum mechanics is the use of operators in mathematical manipulations.

3.5.1 The Schrödinger Equation

Consider the following equation that describes a wave in the x direction (see Appendices B.6 and B.7):

$$\Psi(x,t) = \tilde{A} e^{i(2\pi x/\lambda - 2\pi\nu t)} \tag{3.67}$$

where $\tilde{A} = A' + iA''$ is a complex constant, $\lambda$ is the wavelength, and $\nu$ is the frequency. One can take the real part of $\Psi$, i.e., $\mathrm{Re}(\Psi) = A'\cos(2\pi x/\lambda - 2\pi\nu t) - A''\sin(2\pi x/\lambda - 2\pi\nu t)$, which is a cosine function of x for any given t. The complex notation is convenient for obtaining derivatives. If Eq. (3.67) is used to describe a moving particle, with a mass m and a momentum p, it can be shown that

$$-i\hbar\frac{\partial\Psi}{\partial x} = \frac{h}{\lambda}\Psi = p\Psi \tag{3.68a}$$

$$-\frac{\hbar^2}{2m}\frac{\partial^2\Psi}{\partial x^2} = \frac{p^2}{2m}\Psi = E_K\Psi \tag{3.68b}$$

and

$$i\hbar\frac{\partial\Psi}{\partial t} = h\nu\Psi = \varepsilon\Psi \tag{3.68c}$$

where $\hbar = h/2\pi$ is the Planck constant divided by $2\pi$, $E_K$ is the kinetic energy of the particle, and $\varepsilon$ is the total energy of the particle. In writing Eq. (3.68), we have applied the concept of wave-particle duality to relate $p = h/\lambda$ and $\varepsilon = h\nu$. If the particle possesses only the kinetic and potential energies, we have

$$\varepsilon = E_K + E_P = \frac{p^2}{2m} + \Phi(\mathbf{r}) \tag{3.69a}$$

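where $\Phi(\mathbf{r}) = \Phi(x,y,z)$ is the potential function that depends on the position of the particle. Define the Hamiltonian operator in the 3-D case as

$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + \Phi(\mathbf{r}) \tag{3.69b}$$

It can be seen that $\hat{H}\Psi = \varepsilon\Psi$. Hence,

$$-\frac{\hbar^2}{2m}\nabla^2\Psi + \Phi(\mathbf{r})\Psi = i\hbar\frac{\partial\Psi}{\partial t} \tag{3.70}$$

which is the time-dependent Schrödinger equation.⁸ From $\varepsilon\Psi = i\hbar\,\partial\Psi/\partial t$, one can obtain

$$\Psi(\mathbf{r},t) = \Psi_0(\mathbf{r})\,e^{-i\varepsilon t/\hbar} \tag{3.71a}$$

The general time dependence for different energy eigenvalues can be written as a summation:

$$\Psi(\mathbf{r},t) = A_1\Psi_{01}(\mathbf{r})\,e^{-i\varepsilon_1 t/\hbar} + A_2\Psi_{02}(\mathbf{r})\,e^{-i\varepsilon_2 t/\hbar} + \cdots \tag{3.71b}$$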

Therefore, the key to solving the Schrödinger equation becomes how to obtain the initial wavefunctions. For this reason, Eq. (3.70) can be rewritten as follows:

$$-\frac{\hbar^2}{2m}\nabla^2\Psi + \Phi(\mathbf{r})\Psi = \varepsilon\Psi \tag{3.72}$$

which is called the time-independent Schrödinger equation. The solution gives the wavefunction $\Psi(\mathbf{r})$, which is often expressed in terms of a set of eigenfunctions, $\Psi_1, \Psi_2, \Psi_3, \ldots$, each with an eigenvalue energy, $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$, respectively. The solution, or the wavefunction, must satisfy

$$\int_V \Psi\Psi^*\,dV = 1 \tag{3.73}$$

where the superscript * denotes the complex conjugate since the wavefunction is in general complex, and the integration is over the whole volume. The physical significance is that the probability of finding the particle in the volume must be 1. The wavefunction is also called a state function because it describes the quantum state of the particle, and $\Psi\Psi^*$ is called the probability density function. The average or expectation value of any physical quantity $\eta$ is calculated by

$$\langle\eta\rangle = \int_V \Psi^*\hat{\eta}\,\Psi\,dV \tag{3.74}$$

where $\hat{\eta}$ signifies an operator of $\eta$. For example, the average energy of the particle is

$$\langle\varepsilon\rangle = \int_V \Psi^*\hat{H}\,\Psi\,dV \tag{3.75}$$


Several examples are discussed in the following sections to show how to obtain the wavefunctions and the physical significance of the solutions.

3.5.2 A Particle in a Potential Well or a Box

The 1-D potential well is illustrated in Fig. 3.13a, where a particle is confined within a physical space between $0 < x < L$ and the particle can move parallel to the x axis only. This is equivalent to saying that the potential energy is zero inside and infinite outside the potential well, i.e.,

$$\Phi(x) = \begin{cases} 0, & \text{for } 0 < x < L \\ \infty, & \text{at } x \le 0 \text{ and } x \ge L \end{cases} \tag{3.76}$$

FIGURE 3.13 Illustration of (a) a 1-D potential well and (b) the eigenfunctions.

The Schrödinger equation becomes

$$-\frac{\hbar^2}{2m}\nabla^2\Psi = \varepsilon\Psi \tag{3.77}$$

whose solutions are $\Psi(x) = A\cos(kx) + B\sin(kx)$, where $k = \sqrt{2m\varepsilon/\hbar^2}$. Because the particle is confined inside the well, the wavefunction must be zero outside the potential well. Another requirement for the wavefunction is that it must be continuous. Thus, we must have $\Psi(0) = \Psi(L) = 0$. This requires that $A = 0$ and, by taking only the positive k values, we have

$$kL = n\pi, \quad n = 1, 2, 3, \ldots \tag{3.78}$$

The eigenfunctions are therefore $\Psi_n(x) = B_n\sin(n\pi x/L)$, which can be normalized by letting $\int_0^L \Psi_n(x)\Psi_n^*(x)\,dx = 1$ to get

$$\Psi_n(x) = \sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right) \tag{3.79}$$


Therefore, the solution requires the particle to possess discretized energy values, i.e., its energy cannot be increased continuously but only with finite differences between neighboring states. Combining $k = \sqrt{2m\varepsilon/\hbar^2}$ with Eq. (3.78), it can easily be seen that

$$\varepsilon_n = \frac{h^2 n^2}{8mL^2} \tag{3.80}$$
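Equation (3.80) makes it straightforward to judge when quantization matters by comparing the level spacing with the thermal energy $k_BT$. The sketch below does so for an electron in a 10-nm well and for a nitrogen molecule in a 1-cm box at 300 K; these two cases, the free-electron mass, and the variable names are assumptions chosen for illustration.

```python
H = 6.62607015e-34          # Planck constant, J*s
K_B = 1.380649e-23          # Boltzmann constant, J/K
EV = 1.602176634e-19        # J per eV
M_ELECTRON = 9.1093837e-31  # free-electron mass, kg
M_N2 = 28.0134e-3 / 6.02214076e23   # mass of one N2 molecule, kg


def well_energy(n, mass, width):
    """Energy eigenvalue of Eq. (3.80) for a 1-D infinite well, in joules."""
    return H**2 * n**2 / (8.0 * mass * width**2)


def spacing_over_kt(mass, width, temperature):
    """(epsilon_2 - epsilon_1) / (k_B T): level spacing relative to thermal energy."""
    gap = well_energy(2, mass, width) - well_energy(1, mass, width)
    return gap / (K_B * temperature)


e1_electron_mev = well_energy(1, M_ELECTRON, 10e-9) / EV * 1e3
ratio_electron = spacing_over_kt(M_ELECTRON, 10e-9, 300.0)
ratio_n2 = spacing_over_kt(M_N2, 1e-2, 300.0)

print(f"electron, L = 10 nm: epsilon_1 = {e1_electron_mev:.2f} meV")
print(f"electron, L = 10 nm: spacing / k_B T = {ratio_electron:.3f}")
print(f"N2 molecule, L = 1 cm: spacing / k_B T = {ratio_n2:.2e}")
```

The electron level spacing is a sizable fraction of $k_BT$, whereas for the molecule in a macroscopic box it is smaller by many orders of magnitude, so the levels form a quasi-continuum.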

The quantized energy eigenvalues are called energy levels for each quantum state, and the index n is called a quantum number. The eigenfunctions are standing waves as shown in Fig. 3.13b for the first four quantum states. For molecules, the difference between energy levels is very small and the energy distribution can often be approximated as a continuous distribution. For electrons at very small distances, $L \to 10$ nm for example, quantization may be important. The effects of quantum confinement take place when the quantum well thickness becomes comparable to the de Broglie wavelength of the particle, such as electrons or holes in a semiconductor. Quantum wells can be formed by a sandwiched structure of heterogeneous layers, such as AlGaAs/GaAs/AlGaAs. The bandgap of the outer layers is larger than that of the inner layers to form an effective potential well. These structures are used for optoelectronic applications such as lasers and radiation detectors. The thickness of the active region can be a few nanometers. In some cases, multiple quantum wells are formed with periodic layered structures, called superlattices, which have unique optical, electrical, and thermal properties.

Example 3-7. Derive the uncertainty principle. Suppose the wavefunction is given by Eq. (3.79) for a particle with energy $\varepsilon_n$ given in Eq. (3.80).

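Solution. To find the average position of the particle, we use

$$\langle x\rangle = \int_0^L \Psi^* x\,\Psi\,dx = \frac{2}{L}\int_0^L x\sin^2\left(\frac{n\pi x}{L}\right)dx = \frac{L}{2}$$

The variance of x is $\sigma_x^2 = \langle(x - \langle x\rangle)^2\rangle = \langle x^2\rangle - 2\langle x\rangle^2 + \langle x\rangle^2 = \langle x^2\rangle - \langle x\rangle^2$. With

$$\langle x^2\rangle = \frac{2}{L}\int_0^L x^2\sin^2\left(\frac{n\pi x}{L}\right)dx = \frac{L^2}{3} - \frac{L^2}{2n^2\pi^2}$$

we obtain the standard deviation of x as

$$\sigma_x = L\left(\frac{1}{12} - \frac{1}{2n^2\pi^2}\right)^{1/2}$$

For the momentum, we use the operator $p \to -i\hbar(\partial/\partial x)$. Hence,

$$\langle p\rangle = \int_0^L \Psi^*\left(-i\hbar\frac{d}{dx}\right)\Psi\,dx = -i\hbar\frac{2n\pi}{L^2}\int_0^L \sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi x}{L}\right)dx = 0$$

and

$$\langle p^2\rangle = \int_0^L \Psi^*(-\hbar^2)\frac{d^2\Psi}{dx^2}\,dx = \left(\frac{n\pi\hbar}{L}\right)^2$$

We have $\sigma_p = n\pi\hbar/L$ and obtain the following expression:

$$\sigma_x\sigma_p = \frac{\hbar}{2}\left(\frac{\pi^2 n^2}{3} - 2\right)^{1/2} \tag{3.81}$$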


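Taking the smallest quantum number, n = 1, we get $\sigma_x\sigma_p \approx 0.568\hbar > \hbar/2$, which is consistent with the uncertainty principle given in Eq. (3.9). Since Eq. (3.81) increases with n, the inequality holds for every quantum state of the well.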
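The result of Example 3-7 can be cross-checked by evaluating the position moments numerically from the probability density of Eq. (3.79) and comparing with the closed form of Eq. (3.81). In this sketch the well width is set to 1 (the product does not depend on L), the momentum spread is taken from the analytical result $\sigma_p = n\pi\hbar/L$, and the midpoint-rule integrator and function names are choices made here.

```python
import math


def uncertainty_product_numeric(n, width=1.0, steps=20000):
    """sigma_x * sigma_p / hbar, with <x> and <x^2> from midpoint-rule integration."""
    h = width / steps
    mean_x = 0.0
    mean_x2 = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        density = (2.0 / width) * math.sin(n * math.pi * x / width) ** 2
        mean_x += x * density * h
        mean_x2 += x * x * density * h
    sigma_x = math.sqrt(mean_x2 - mean_x**2)
    sigma_p_over_hbar = n * math.pi / width     # analytical result of Example 3-7
    return sigma_x * sigma_p_over_hbar


def uncertainty_product_closed_form(n):
    """Right-hand side of Eq. (3.81) divided by hbar."""
    return 0.5 * math.sqrt(math.pi**2 * n**2 / 3.0 - 2.0)


products = {n: (uncertainty_product_numeric(n), uncertainty_product_closed_form(n))
            for n in (1, 2, 3)}
for n, (numeric, closed) in products.items():
    print(f"n = {n}: numeric = {numeric:.5f} hbar, Eq. (3.81) = {closed:.5f} hbar")
```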

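Next, consider a free particle in a 3-D box, $0 < x < a$, $0 < y < b$, $0 < z < c$. It can be shown that the (normalized) eigenfunctions are

$$\Psi_{x,y,z} = \sqrt{\frac{8}{abc}}\,\sin\left(\frac{n_x\pi x}{a}\right)\sin\left(\frac{n_y\pi y}{b}\right)\sin\left(\frac{n_z\pi z}{c}\right) \tag{3.82}$$

with the energy eigenvalues:

$$\varepsilon_{x,y,z} = \frac{h^2}{8m}\left(\frac{n_x^2}{a^2} + \frac{n_y^2}{b^2} + \frac{n_z^2}{c^2}\right) \tag{3.83}$$

where $n_x, n_y, n_z = 1, 2, 3, \ldots$ When $a = b = c = V^{1/3}$, Eq. (3.83) can be simplified as

$$\varepsilon_{x,y,z} = \frac{h^2}{8mV^{2/3}}\left(n_x^2 + n_y^2 + n_z^2\right) \tag{3.84}$$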

Let $\eta = (n_x^2 + n_y^2 + n_z^2)^{1/2}$; then we can evaluate the number of quantum states between $\eta$ and $\eta + d\eta$, which is nothing but the degeneracy. For sufficiently large V, the quantum states are so close to each other that the volume within the spherical shell between $\eta$ and $\eta + d\eta$ is equal to the number of quantum states. Only one octant of the sphere is considered in Eq. (3.84) because $n_x > 0$, $n_y > 0$, and $n_z > 0$. The total volume is therefore one-eighth of the spherical shell; hence,

$$dg = \frac{1}{8}\,4\pi\eta^2\,d\eta = \frac{2\pi V(2m)^{3/2}}{h^3}\,\varepsilon^{1/2}\,d\varepsilon \tag{3.85}$$
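The continuum approximation behind Eq. (3.85) can be illustrated by brute-force counting. Integrating $dg = \frac{1}{8}4\pi\eta^2 d\eta$ from 0 to $\eta$ gives $\pi\eta^3/6$ states, which the sketch below compares with the actual number of positive-integer triples inside the sphere. The function names and the sample radii are assumptions of this example.

```python
import math


def count_states(eta_max):
    """Count triples (nx, ny, nz) of positive integers with nx^2+ny^2+nz^2 <= eta_max^2.

    eta_max must be a positive integer.
    """
    r2 = eta_max * eta_max
    count = 0
    for nx in range(1, eta_max + 1):
        for ny in range(1, eta_max + 1):
            remainder = r2 - nx * nx - ny * ny
            if remainder < 1:
                break                        # larger ny only reduces the remainder
            count += math.isqrt(remainder)   # allowed nz = 1 .. floor(sqrt(remainder))
    return count


def octant_volume(eta_max):
    """Continuum estimate: one-eighth of a sphere of radius eta_max."""
    return math.pi * eta_max**3 / 6.0


ratios = {eta: count_states(eta) / octant_volume(eta) for eta in (25, 50, 100)}
for eta, ratio in ratios.items():
    print(f"eta = {eta:3d}: counted / continuum = {ratio:.4f}")
```

The count falls slightly below the continuum value because states with a zero quantum number are excluded, and the relative gap shrinks as $\eta$ grows, which is the sense in which the approximation holds for sufficiently large V.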

With $\varepsilon = \frac{1}{2}mv^2$ and $d\varepsilon = mv\,dv$, we obtain

$$dg = \frac{m^3 V}{h^3}\,4\pi v^2\,dv \tag{3.86}$$

This equation is essentially the same as Eq. (3.32), with $dx\,dy\,dz = V$ and $dv_x\,dv_y\,dv_z = 4\pi v^2\,dv$. Equation (3.86) provides a rigorous proof of Eq. (3.32), which is the translational degeneracy. It should be noted that classical statistical mechanics results in the same expressions for U and p, as well as the Maxwell velocity distribution for ideal gases. However, the constant h must be included to correctly express S as in Eq. (3.41). Equation (3.86) will also be used in Chap. 5 to study the free electron gas in metals. When using the momentum $p = mv$ as the variable, we have

$$dg = \frac{V}{h^3}\,4\pi p^2\,dp \tag{3.87}$$

3.5.3 A Rigid Rotor

The rigid rotor model can be used to study the rotational movement of diatomic molecules as well as the movement of an electron in a hydrogen atom. Consider two particles separated by a fixed distance $r_0 = r_1 + r_2$ as shown in Fig. 3.14. The masses of the particles are $m_1$ and $m_2$, respectively. Since the center of mass is at the origin, we have $m_1 r_1 = m_2 r_2$. The moment of inertia is

$$I = m_1 r_1^2 + m_2 r_2^2 = m_r r_0^2 \tag{3.88}$$

FIGURE 3.14 Schematic of a rotor consisting of two particles.

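where $m_r = m_1 m_2/(m_1 + m_2)$ is the reduced mass. We can study the rotational movement of the particles by considering a particle with a mass of $m_r$ that rotates at a fixed distance $r = r_0$ from the origin in the $\theta$ and $\phi$ directions. In the spherical coordinates,

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2} \tag{3.89}$$

Because $r \equiv r_0$, the derivative with respect to r vanishes. The potential energy is zero for free rotation. By setting the mass to be $m_r$ and $\Phi = 0$ in Eq. (3.72) and noticing that $m_r r_0^2 = I$, we obtain

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\Psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\Psi}{\partial\phi^2} = -\frac{2I\varepsilon}{\hbar^2}\Psi \tag{3.90}$$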

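This partial differential equation can be solved by separation of variables. We get two ordinary differential equations by letting $\Psi(\theta,\phi) = P(\theta)\psi(\phi)$, i.e.,

$$\frac{d^2\psi}{d\phi^2} = -m^2\psi \tag{3.91}$$

and

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{dP}{d\theta}\right) + \left(\frac{2I\varepsilon}{\hbar^2} - \frac{m^2}{\sin^2\theta}\right)P = 0 \tag{3.92}$$

Here, m is a new eigenvalue, and the periodic boundary conditions shall be applied to P and $\psi$, respectively. The solution of Eq. (3.91) is readily obtained as

$$\psi(\phi) = A e^{im\phi} \tag{3.93}$$

with $m = 0, \pm 1, \pm 2, \ldots$, to satisfy the periodic boundary conditions: $\psi(\phi) = \psi(2\pi + \phi)$. A transformation, $\cos\theta = \xi$, can be used so that Eq. (3.92) becomes

$$(1 - \xi^2)\frac{d^2P}{d\xi^2} - 2\xi\frac{dP}{d\xi} + \left(\frac{2I\varepsilon}{\hbar^2} - \frac{m^2}{1 - \xi^2}\right)P = 0 \tag{3.94}$$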

Because $\theta$ is defined from 0 to $\pi$, we have $-1 \le \xi \le 1$. In order for Eq. (3.94) to have solutions that are bounded at $\xi = \pm 1$, we must have $2I\varepsilon/\hbar^2 = l(l+1)$, where l is an integer that is greater than or at least equal to the absolute value of m. Therefore, the energy eigenvalues are

$$\varepsilon_l = \frac{\hbar^2}{2I}\,l(l+1), \quad l = |m|,\ |m|+1,\ |m|+2,\ \text{etc.} \tag{3.95}$$

Equation (3.94) is called the associated Legendre differential equation. The solutions are the associated Legendre polynomials given as

$$P_l^m(\xi) = \frac{(1 - \xi^2)^{m/2}}{l!\,2^l}\,\frac{d^{m+l}}{d\xi^{m+l}}(\xi^2 - 1)^l \tag{3.96}$$

Finally, after normalization, the standing wavefunctions can be expressed as

$$\Psi_l^m(\theta,\phi) = \frac{1}{\sqrt{2\pi}}\left[\frac{(2l+1)(l-m)!}{2(l+m)!}\right]^{1/2} P_l^m(\cos\theta)\,e^{im\phi} \tag{3.97}$$

Discussion. It can be seen that Eq. (3.95) is identical to Eq. (3.52). The energy level is determined by the principal quantum number l. On the other hand, for each l, there are $2l + 1$ quantum states corresponding to each individual m, because m can take $0, \pm 1, \pm 2$, up to $\pm l$. This means that the degeneracy is $g_l = 2l + 1$. When the two atoms are identical, such as in a nitrogen molecule, the atoms are indistinguishable when they switch positions. The degeneracy is reduced by a symmetry number, as given in the expression of Eq. (3.53). It should be noted that the nuclear spin degeneracy is important for hydrogen (see Problem 3.27).¹
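The link between Eq. (3.95) and the characteristic temperatures of Table 3.2 can be illustrated by computing $\Theta_r = \hbar^2/(2k_BI)$ from the moment of inertia in Eq. (3.88). The bond lengths used below (about 1.098 Å for N2 and 1.128 Å for CO) and the atomic masses are assumed literature values that do not appear in this chapter, so the output should be read as an order-of-magnitude check rather than a reproduction of the table.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053907e-27     # atomic mass unit, kg


def rotational_temperature(m1_amu, m2_amu, bond_length):
    """theta_r = hbar^2 / (2 k_B I), with I = m_r r0^2 from Eq. (3.88)."""
    reduced_mass = m1_amu * m2_amu / (m1_amu + m2_amu) * AMU
    inertia = reduced_mass * bond_length**2
    return HBAR**2 / (2.0 * K_B * inertia)


def rotational_levels(theta_r, l_max):
    """List of (l, epsilon_l / k_B in kelvin, degeneracy 2l + 1) from Eq. (3.95)."""
    return [(l, l * (l + 1) * theta_r, 2 * l + 1) for l in range(l_max + 1)]


# Assumed inputs (not from the text): atomic masses in amu, bond lengths in meters.
theta_n2 = rotational_temperature(14.003, 14.003, 1.098e-10)
theta_co = rotational_temperature(12.000, 15.995, 1.128e-10)

print(f"theta_r(N2) = {theta_n2:.2f} K (Table 3.2 lists 2.86 K)")
print(f"theta_r(CO) = {theta_co:.2f} K (Table 3.2 lists 2.78 K)")
for l, energy_k, degeneracy in rotational_levels(theta_n2, 3):
    print(f"N2: l = {l}, epsilon_l/k_B = {energy_k:6.2f} K, 2l+1 = {degeneracy}")
```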

3.5.4 Atomic Emission and the Bohr Radius

A hydrogen atom is composed of a proton and an electron. Since the mass of the proton is much greater than that of the electron, the atom can be modeled as the electron moving around the nucleus. The mass of the electron is $m_e = 9.11 \times 10^{-31}$ kg, and the position of the electron can be described in the spherical coordinates as $\mathbf{r} = (r,\theta,\phi)$. The force exerted on the electron is Coulomb’s force, which gives a potential field

$$\Phi(r) = -\frac{C_1}{r} \tag{3.98}$$

where $C_1 = e^2/(4\pi\varepsilon_0) = 2.307 \times 10^{-28}$ N·m², with the electron charge $e = 1.602 \times 10^{-19}$ C and the vacuum permittivity $\varepsilon_0 = 8.854 \times 10^{-12}$ F/m. Let $\Psi(r,\theta,\phi) = R(r)P(\theta)\psi(\phi)$. In doing the separation of variables, we notice that the potential is independent of $\theta$ and $\phi$, and the total energy is equal to the sum of the rotational energy and the energy associated with r. The eigenvalues for the rotational energy are given in Eq. (3.95). Using Eq. (3.72) and Eq. (3.89), we can write the equation for R(r) as follows:

$$-\frac{\hbar^2}{2m_e r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[\frac{l(l+1)\hbar^2}{2I} - \frac{C_1}{r} - \varepsilon\right]R = 0 \tag{3.99}$$

where $I = m_e r^2$,


which is the associated Laguerre equation, and its solutions are the associated Laguerre polynomials. The solutions give the energy eigenvalues as⁵,⁸

$$\varepsilon_n = -\frac{m_e C_1^2}{2\hbar^2 n^2} \tag{3.100}$$

where the negative values are used for convenience to show that the energy increases with the principal quantum number n. For n = 1, $m_e C_1^2/2\hbar^2 = 13.6$ eV, as shown in Fig. 3.15. Note that 1 eV = $1.602 \times 10^{-19}$ J. When the electron is in a higher energy state, it has a tendency to relax to a lower energy state by spontaneously emitting a photon, with precisely the same energy as given by the energy difference between the two energy levels:

$$h\nu = \varepsilon_i - \varepsilon_j = \frac{m_e C_1^2}{2\hbar^2}\left(\frac{1}{n_j^2} - \frac{1}{n_i^2}\right) \tag{3.101}$$

FIGURE 3.15 Electron orbits (left) and energy levels (right) in a hydrogen atom. The ionization energy is the energy required for an electron to escape the orbit.

The emission or absorption of photons by electrons is called an electronic transition. When $i = 3$ and $j = 1$, we have $h\nu = 12.1$ eV, corresponding to the wavelength of 102.6 nm (ultraviolet), which is the second line in the Lyman series. When $i = 3$ and $j = 2$, we have $h\nu = 1.89$ eV, corresponding to the wavelength of 656.4 nm (red), which is the first line in the Balmer series. A more detailed description of the atomic emission lines can be found in Sonntag and van Wylen [2].

The next question is: What is the radius of a particular electron orbit? This is an important question because it gives us a sense of how small an atom is. When a particle is in an orbit, the classical force balance gives

$$m_e\frac{v^2}{r} = \frac{C_1}{r^2} \tag{3.102}$$

which is to say that $E_K = m_e v^2/2 = C_1/2r$, and the sum of the kinetic and potential energies is

$$\varepsilon = E_K + E_P = \frac{C_1}{2r} - \frac{C_1}{r} = -\frac{C_1}{2r} \tag{3.103}$$

Equations (3.100) and (3.103) can be combined to give discrete values of the radius of each orbit:

$$r_n = \frac{\hbar^2}{m_e C_1}n^2 = a_0 n^2 \tag{3.104}$$
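As a numerical check of Eqs. (3.100), (3.101), and (3.104), the following minimal Python sketch evaluates the energy levels, the transition wavelengths, and the orbit radii of the hydrogen atom. The function names and the constant values (rounded CODATA figures) are illustrative choices, not taken from the text, and the nucleus is assumed to be infinitely heavy.

```python
import math

# Physical constants in SI units (rounded CODATA values; assumed here)
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
HBAR = 1.054571817e-34       # reduced Planck constant, J s
H_PLANCK = 6.62607015e-34    # Planck constant, J s
C_LIGHT = 2.99792458e8       # speed of light, m/s
M_E = 9.1093837015e-31       # electron mass, kg

# C1 = e^2 / (4 pi eps0), in N m^2
C1 = E_CHARGE**2 / (4.0 * math.pi * EPS0)


def energy_level_eV(n):
    """Eq. (3.100): eps_n = -m_e C1^2 / (2 hbar^2 n^2), returned in eV."""
    return -M_E * C1**2 / (2.0 * HBAR**2 * n**2) / E_CHARGE


def transition_wavelength_nm(n_i, n_j):
    """Eq. (3.101): wavelength (nm) of the photon emitted when the electron
    relaxes from level n_i to a lower level n_j."""
    photon_energy = (energy_level_eV(n_i) - energy_level_eV(n_j)) * E_CHARGE
    return H_PLANCK * C_LIGHT / photon_energy * 1e9


def orbit_radius_nm(n):
    """Eq. (3.104): r_n = hbar^2 n^2 / (m_e C1), returned in nm."""
    return HBAR**2 * n**2 / (M_E * C1) * 1e9


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n = {n}: energy = {energy_level_eV(n):8.3f} eV, "
              f"radius = {orbit_radius_nm(n):.4f} nm")
    print(f"Lyman  3 -> 1: {transition_wavelength_nm(3, 1):.1f} nm")
    print(f"Balmer 3 -> 2: {transition_wavelength_nm(3, 2):.1f} nm")
```

With these constants the ground-state energy comes out close to 13.6 eV in magnitude and the ground-state radius close to 0.0529 nm. Small differences between the computed wavelengths and the rounded values quoted above are to be expected.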


When the electron is in the innermost orbit, the radius is given by $a_0 = \varepsilon_0 h^2/(\pi m_e e^2) = 0.0529$ nm, which is called the Bohr radius. Therefore, the hydrogen atom in its ground state can be considered as having a diameter of approximately 1 Å (angstrom), or 0.1 nm. Niels Bohr (1885–1962) was a Danish physicist who received the Nobel Prize in Physics in 1922 for his contributions to the understanding of the structure of atoms and quantum physics. One should interpret the quantum-mechanical electron radius as a characteristic length, not as the exact distance at which the electron revolves around the nucleus in the way a planet revolves around a star.

3.5.5 A Harmonic Oscillator

The last example of quantum mechanics is the linear spring as shown in Fig. 3.16. Consider a 1-D oscillator with a mass $m$ and the spring force $F(x) = -Kx$. The origin can be selected such that $F(0) = 0$. It can be shown that the potential is

$$\Phi(x) = -\int_0^x F(x)\,dx = \frac{1}{2}Kx^2 \tag{3.105}$$

FIGURE 3.16 A linear spring.

From classical mechanics, we can solve Newton's equation $m\ddot{x} + Kx = 0$ to obtain the solution

$$x = A\sin(\omega t + \phi_0) \tag{3.106}$$

where constant $A$ is the amplitude, constant $\phi_0$ is the initial phase, and parameter $\omega = \sqrt{K/m}$ is the angular resonance frequency. It can be shown that the total energy $\varepsilon = E_K + E_P = KA^2/2$ is a constant and the maximum displacement is $A$. The velocity is the largest at $x = 0$ and zero at $x = \pm A$. The Schrödinger wave equation can be written as

$$\frac{\hbar^2}{2m}\frac{d^2\Psi}{dx^2} + \left(\varepsilon - \frac{Kx^2}{2}\right)\Psi = 0 \tag{3.107}$$

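with the boundary condition being $\Psi(x) = 0$ at $x \to \pm\infty$. The constants can be grouped by using $a = 2m\varepsilon/\hbar^2$ and $b = \sqrt{Km}/\hbar$. Then Eq. (3.107) can be transformed by using $\xi = \sqrt{b}\,x$ and $\Psi(x) = Q(\xi)\exp(-\xi^2/2)$ to

$$\frac{d^2Q}{d\xi^2} - 2\xi\frac{dQ}{d\xi} + \left(\frac{a}{b} - 1\right)Q = 0 \tag{3.108}$$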

This is the Hermite equation, and the solutions are Hermite polynomials given by

$$H_n(\xi) = (-1)^n e^{\xi^2}\frac{d^n}{d\xi^n}\left(e^{-\xi^2}\right) \tag{3.109}$$


where $a$ and $b$ must satisfy the eigenvalue equation:

$$\frac{a}{b} - 1 = 2n, \quad n = 0, 1, 2, \ldots \tag{3.110}$$

The normalized wavefunctions can be written as

$$\Psi_n(x) = \left(\frac{\sqrt{b/\pi}}{n!\,2^n}\right)^{1/2} H_n\!\left(b^{1/2}x\right)\exp\left(-\frac{bx^2}{2}\right) \tag{3.111}$$

The energy eigenvalues can be obtained from Eq. (3.110) as

$$\varepsilon_n = \left(n + \frac{1}{2}\right)\hbar\sqrt{K/m} = \left(n + \frac{1}{2}\right)\hbar\omega \tag{3.112}$$

The above equation was used to study the vibrational contributions in diatomic molecules; see Eq. (3.59). The 1/2 term was not included in Planck's original derivation of the blackbody radiation function. Its significance is that if the ground-state energy were zero, both the kinetic energy and the potential energy would have to be zero, implying that both the position and the momentum are exactly zero. This would violate the uncertainty principle. As mentioned earlier, in classical mechanics, the particle is limited to the region $-A < x < A$, where $A$ is the amplitude given in Eq. (3.106). This is not the case in the quantum theory, as shown in Fig. 3.17 for the first few energy levels and the associated wavefunctions. Notice that the probability density function $\Psi^2$ is nonzero even where the absolute value of $x$ exceeds $\sqrt{2\varepsilon/K}$.

FIGURE 3.17 Wavefunctions and probability density functions for vibration energy levels.
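The behavior shown in Fig. 3.17 can be explored numerically. The sketch below, in plain Python, evaluates Eq. (3.111) in the dimensionless coordinate $\xi = \sqrt{b}\,x$ (equivalent to setting $b = 1$), checks the normalization, and estimates the probability of finding the particle outside the classical turning points $\xi = \pm\sqrt{2n+1}$, which follow from $A = \sqrt{2\varepsilon_n/K}$ and Eq. (3.112). The function names, the trapezoidal integration, and the integration cutoff are assumptions made for illustration.

```python
import math


def hermite(n, xi):
    """Hermite polynomial H_n(xi), built with the recurrence
    H_{k+1} = 2 xi H_k - 2 k H_{k-1}, which is equivalent to Eq. (3.109)."""
    h_prev, h = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h


def psi(n, xi):
    """Eq. (3.111) written in the dimensionless coordinate xi (b = 1)."""
    norm = math.sqrt(1.0 / (math.sqrt(math.pi) * 2**n * math.factorial(n)))
    return norm * hermite(n, xi) * math.exp(-0.5 * xi * xi)


def integrate_density(n, lo, hi, steps=20000):
    """Trapezoidal integral of psi_n^2 over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.5 * (psi(n, lo)**2 + psi(n, hi)**2)
    for i in range(1, steps):
        total += psi(n, lo + i * dx)**2
    return total * dx


def probability_outside_classical(n, cutoff=10.0):
    """Probability that |xi| exceeds the classical turning point sqrt(2n + 1).
    The tail beyond `cutoff` is negligible for small n."""
    xi_turn = math.sqrt(2 * n + 1)
    return 2.0 * integrate_density(n, xi_turn, cutoff)


if __name__ == "__main__":
    for n in range(3):
        print(f"n = {n}: norm = {integrate_density(n, -10.0, 10.0):.6f}, "
              f"P(outside) = {probability_outside_classical(n):.4f}")
```

For the ground state, the probability outside the classical region equals $\mathrm{erfc}(1)$, roughly 16%, which quantifies the statement that $\Psi^2$ does not vanish beyond the classical amplitude.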

The application of quantum theory allows us to predict the specific heat of ideal gases. In deriving the equations shown in Sec. 3.3.3, we have largely neglected nonlinear and anharmonic vibration, electronic contribution, and dissociation. These factors may become important at very high temperatures. The degeneracy due to the coupling of rotation and vibration can cause multiple absorption/emission lines in the infrared in polyatomic molecular gases, as shown in Fig. 3.12.


3.6 EMISSION AND ABSORPTION OF PHOTONS BY MOLECULES OR ATOMS

We have learned that the emission of photons is associated with transitions from a higher energy level to a lower energy level that reduce the total energy of the molecular system. The reverse process is the absorption of photons, which increases the energy of the system through transitions from lower energy levels to higher energy levels. As discussed earlier, an electronic transition requires a large amount of energy, and the emitted or absorbed photons are at frequencies from the deep ultraviolet ($\lambda \approx 100$ nm) to slightly beyond the red end of the visible region ($\lambda \approx 1\ \mu$m). On the other hand, vibration or rotation-vibration modes lie in the mid-infrared ($2.5\ \mu\mathrm{m} < \lambda < 25\ \mu\mathrm{m}$), while their overtones or higher-order harmonics lie in the near-infrared region ($0.8\ \mu\mathrm{m} < \lambda < 2.5\ \mu\mathrm{m}$). Rotational modes alone may be active in the far-infrared and microwave regions ($\lambda > 25\ \mu$m).

Transitions between different energy levels of the molecules or atoms are called bound-bound transitions, because these energy states are called bound states. Bound-bound transitions happen at discrete frequencies due to the quantization of energy levels. Dissociation or ionization can also occur at high temperatures. In that case, the difference between adjacent energy levels is very small because the electrons can move freely (i.e., they are not bound to the atom or the molecule). Therefore, free-free or bound-free transitions happen over a broad band of frequencies. In gases, these broader transitions occur only at extremely high temperatures.

If a molecule at elevated energy states were placed in surroundings at zero absolute temperature (i.e., empty space), it would lower its energy states by emitting photons in all directions until reaching its ground state. However, the emission processes should occur spontaneously regardless of the surroundings.
Suppose the molecule is placed inside an isothermal enclosure. After a long time, the energy absorbed must equal the energy emitted so that thermal equilibrium with the surroundings is established. The thermal fluctuation of oscillators is responsible for the equilibrium distribution, i.e., Planck's law developed in 1900. Einstein examined how matter and radiation can achieve thermal equilibrium in a fundamental way and published a remarkable paper, "On the quantum theory of radiation," in 1917 [11]. The interaction of radiation with matter is essentially through emission or absorption at the atomistic dimension, although solids or liquids can reflect radiation and small particles can scatter radiation. Einstein noticed that spontaneous emission and pure absorption (i.e., transition from a lower level to a higher level by absorbing the energy from the incoming radiation) alone would not allow an equilibrium state of an atom to be established with the radiation field. He then hypothesized the concept of stimulated or induced emission, which became the underlying principle of lasers. In a stimulated emission process, an incoming photon interacts with the atom. The interaction results in a transition from a higher energy state to a lower energy state by the emission of another photon of the same energy in the same direction as the incoming photon. In other words, the stimulated photon is a clone of the stimulating photon, with the same energy and momentum. Whether an incoming photon will be absorbed, will stimulate another, or will pass by without any effect on the atom is characterized by the probabilities of these events. Understanding the emission and absorption processes is important not only for coherent emission but also for thermal radiation [12]. While more detailed treatments will be given in later chapters, it is important to gain a basic understanding of the quantum theory of radiative transitions and the microscopic description of the radiative properties.
Consider a canonical ensemble of single molecules or atoms, with two nondegenerate energy levels, $\varepsilon_1$ and $\varepsilon_2$ ($\varepsilon_2 > \varepsilon_1$), in thermal equilibrium with an enclosure or cavity at temperature $T$. Suppose the total number of particles is $N$, and let $N_1$ and $N_2$ be the number of particles at the energy level corresponding to $\varepsilon_1$ and $\varepsilon_2$, respectively. These particles do not interact with each other at all. The canonical ensemble can be pictured as $N$ identical cavities, each containing a single atom. As shown in Fig. 3.18, there are three possible interaction mechanisms, i.e., spontaneous


FIGURE 3.18 Illustration of the emission and absorption processes. (a) Spontaneous emission. (b) Stimulated emission. (c) Stimulated absorption.

emission, stimulated emission, and stimulated or induced absorption. Here, stimulated absorption refers to the process in which the energy of the photon is absorbed and, consequently, a transition occurs from the lower energy level to the higher energy level. In a stimulated absorption process, the number of photons before the process is 1 and after the process is $1 - 1 = 0$. In a stimulated emission process, the number of photons beforehand is 1 and afterward is $1 + 1 = 2$. Therefore, stimulated emission is also regarded as negative absorption. Each of the photons involved in this process will have an energy equal to $h\nu = \varepsilon_2 - \varepsilon_1$ and a momentum $h\nu/c$. Transition from the higher energy level to the lower energy level cannot take place if the population of atoms on the higher energy level is zero, i.e., $N_2 = 0$, and vice versa. Einstein further assumed that the probability of transition is proportional to the population at the initial energy level, and that spontaneous transition should be independent of the radiation field. Hence, the rate of transition from $\varepsilon_2$ to $\varepsilon_1$ due to spontaneous emission can be written as

$$\left(\frac{dN_1}{dt}\right)_A = -\left(\frac{dN_2}{dt}\right)_A = AN_2 \tag{3.113}$$

where $A$ is Einstein's coefficient of spontaneous emission. On the other hand, the transition rate due to stimulated emission should also be proportional to the energy density of the radiation field $u(\nu,T)$. Thus,

$$\left(\frac{dN_1}{dt}\right)_B = BN_2u(\nu,T) \tag{3.114}$$

Stimulated absorption will cause a transition rate that is proportional to $N_1$ and $u(\nu,T)$:

$$\left(\frac{dN_1}{dt}\right)_C = -CN_1u(\nu,T) \tag{3.115}$$

In Eq. (3.114) and Eq. (3.115), constants $B$ and $C$ are Einstein's coefficients of stimulated emission and absorption, respectively. The combination of these processes must maintain a zero net change of the populations at equilibrium. Thus,

$$AN_2 + BN_2u(\nu,T) - CN_1u(\nu,T) = 0 \tag{3.116}$$

Atoms or molecules in thermal equilibrium are described by the Maxwell-Boltzmann statistics of molecular gases given by Eq. (3.26): $N_1/N_2 = e^{(\varepsilon_2 - \varepsilon_1)/k_BT} = e^{h\nu/k_BT}$. Therefore, Eq. (3.116) can be rewritten as

$$u(\nu,T) = \frac{A/B}{(C/B)e^{h\nu/k_BT} - 1} \tag{3.117}$$


Comparing this equation with Planck's distribution, Eq. (8.41) in Chap. 8, we see that $B = C$ and $A/B = 8\pi h\nu^3/c^3$. The two-level system can easily be generalized to arbitrary energy levels to describe the fundamental emission and absorption processes. The emission and absorption processes not only exchange energy between the field and the atom but also transfer momentum. How will an atom move inside a cavity? The behavior of a molecule or atom in a radiation field is like Brownian motion: the radiation quanta exert forces on the molecule or the atom as a result of momentum transfer during each emission or absorption process. Consequently, the molecule or the atom will move randomly following Maxwell's velocity distribution at the same temperature as the radiation field. The equilibrium radiation field, which obeys quantum statistics (i.e., BE statistics, which was not recognized until 1924), and the motion of a molecular gas, which obeys classical statistics, can be coupled to each other and reach mutual equilibrium. Einstein also asserted that each spontaneously emitted photon must be directional, while the probability of spontaneous emission should be the same in all directions. In fact, Einstein's 1917 paper complemented Planck's 1900 paper on radiation energy quanta and his own 1905 paper on photoelectric emission and, thus, provided a complete description of the quantum nature of photons, although the name "photon" was not coined until 1926.

At moderate temperatures, the population at higher energy states is too small for stimulated emission to be of significance for optical and thermal radiation. Thus, the absorption comes solely from induced absorption. When stimulated emission is important, the contributions of stimulated emission and stimulated absorption cannot be separated by experiments.
The two effects are combined to give an effective absorption coefficient by taking stimulated emission as negative absorption, whereas the emission of radiation includes solely the spontaneous emission [12]. The effective absorption coefficient is proportional to the population difference, $N_1 - N_2$. On the other hand, if a population inversion can be created and maintained such that $N_2 > N_1$, the material is called a gain medium or active medium. In an active medium, stimulated emission dominates stimulated absorption so that more and more photons are cloned and the radiation field is amplified coherently. The principle of stimulated emission was applied in the 1950s and early 1960s for the development of the maser, which stands for microwave amplification by stimulated emission of radiation, and the laser, which stands for light amplification by stimulated emission of radiation [13]. Lasers have become indispensable to modern technologies and daily life.
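To see why stimulated emission is negligible for optical and thermal radiation at moderate temperatures, the following Python sketch evaluates the equilibrium population ratio $N_2/N_1 = e^{-h\nu/k_BT}$ and the ratio of stimulated to spontaneous emission rates, $Bu(\nu,T)/A = 1/(e^{h\nu/k_BT} - 1)$, which follows from Eq. (3.117) with $B = C$. The function names and the sample wavelengths are illustrative assumptions, and the last function simply checks that the three rates balance as in Eq. (3.116).

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K
C_LIGHT = 2.99792458e8      # speed of light, m/s


def reduced_energy(wavelength_m, temperature_K):
    """Dimensionless photon energy h nu / (k_B T) for a vacuum wavelength."""
    return H_PLANCK * C_LIGHT / (wavelength_m * K_B * temperature_K)


def population_ratio(wavelength_m, temperature_K):
    """N2/N1 for two nondegenerate levels separated by h nu."""
    return math.exp(-reduced_energy(wavelength_m, temperature_K))


def stimulated_to_spontaneous(wavelength_m, temperature_K):
    """B u / A = 1 / (exp(h nu / k_B T) - 1), from Eq. (3.117) with B = C."""
    return 1.0 / math.expm1(reduced_energy(wavelength_m, temperature_K))


def balance_residual(wavelength_m, temperature_K):
    """Left side of Eq. (3.116) divided by A*N1; it should vanish."""
    n2_over_n1 = population_ratio(wavelength_m, temperature_K)
    bu_over_a = stimulated_to_spontaneous(wavelength_m, temperature_K)
    return n2_over_n1 + bu_over_a * n2_over_n1 - bu_over_a


if __name__ == "__main__":
    cases = [("visible, 0.5 um", 0.5e-6), ("infrared, 10 um", 10e-6),
             ("microwave, 1 cm", 1e-2)]
    for label, lam in cases:
        print(f"{label:18s} at 300 K: N2/N1 = {population_ratio(lam, 300.0):.3e}, "
              f"B u / A = {stimulated_to_spontaneous(lam, 300.0):.3e}")
```

At room temperature the ratio is vanishingly small in the visible and below 1% at 10 µm, but it exceeds unity in the microwave region.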

3.7 ENERGY, MASS, AND MOMENTUM IN TERMS OF RELATIVITY

The special theory of relativity, or special relativity, predicts that energy and mass can be converted to each other. If we retain the definition of mass as in the classical theory, only energy conservation is the fundamental law of physics. The mass does not have to be conserved. On the other hand, for processes that do not involve changes below the atomic level or inside the nuclei, the mass can indeed be considered as conserved. According to special relativity, the rest energy of a free particle is related to its mass and the speed of light by

$$E_0 = mc^2 \tag{3.118}$$

The rest energy is simply the energy when the particle is not moving relative to the reference frame. Suppose the free particle is moving at a velocity $v$ in a given reference frame; then its momentum is given by [14]

$$p = \frac{mv}{\sqrt{1 - v^2/c^2}} \tag{3.119}$$


When $v \ll c$, Eq. (3.119) reduces to the classical limit, i.e., $p = mv$. It can be seen that for a particle with nonzero mass, its momentum would increase without bound as $v \to c$. There is no way we could accelerate a particle to the speed of light. If there is anything that travels with the speed of light, it has to be massless, i.e., $m = 0$. An example of massless particles is the light quanta or photons. The kinetic energy can be evaluated by integrating the work needed to accelerate a particle, i.e.,

$$E_K = \int_0^x F\,dx = \int_0^x \frac{dp}{dt}\,dx = \int_0^x \frac{dp}{dv}\frac{dv}{dt}\,dx = \int_0^v v\frac{dp}{dv}\,dv$$

Using Eq. (3.119), we find that

$$E_K = \frac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2 \tag{3.120}$$

When $v \ll c$, we have $1/\sqrt{1 - v^2/c^2} \approx 1 + v^2/2c^2$ so that $E_K \approx mv^2/2 = p^2/2m$ in the low-speed limit. In the relativistic limit, however, $E_K$ will be on the order of $mc^2$. Because energy is additive, the total energy of a moving free particle is

$$E = E_K + E_0 = \frac{mc^2}{\sqrt{1 - v^2/c^2}} \tag{3.121}$$

Obviously, the energy of a particle would become infinite if its speed approaches the speed of light, unless its mass goes to zero. It can be shown that $E^2 - E_0^2 = m^2c^4/(1 - v^2/c^2) - m^2c^4 = p^2c^2$, where $p$ is given in Eq. (3.119). This gives another expression of energy in terms of the rest energy, the momentum, and the speed of light as follows:

$$E^2 = m^2c^4 + p^2c^2 \tag{3.122}$$

It should be noted that, in general, $pc$ is not equal to the kinetic energy. For $v \ll c$, the total energy is approximately the same as the rest energy. Comparing Eq. (3.119) and Eq. (3.121), we notice that $E = pc(c/v)$. Therefore, when $v \to c$, we see that $E \to pc$ (which is unbounded unless $m = 0$). For a photon that travels at the speed of light, in order for the above equations to be meaningful, we must set its mass to zero. From Eq. (3.122), we have for photons that

$$p = \frac{E}{c} = \frac{h\nu}{c} \tag{3.123}$$

which is the same as Eq. (3.7) in Sec. 3.1.3. By noting that $\lambda\nu = c$, we obtain

$$\lambda = \frac{h}{p} \tag{3.124}$$

The kinetic energy of a photon is $pc$ or $h\nu$ since its rest energy is zero. One should not attempt to calculate the kinetic energy of a photon by $\frac{1}{2}mc^2$, because photons are not only massless but also relativistic particles, for which the energy and momentum must be evaluated according to the above-mentioned equations. While photons do not have mass, it has been observed that photons can be used to create particles with nonzero mass or vice versa, as in creation or annihilation reactions. High-energy physics has proven that mass is not always conserved. Furthermore, energy and mass can be interconverted. A small amount of mass can be converted into a large amount of energy, as in a nuclear reaction.
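The relations of this section are easy to verify numerically. The Python sketch below implements Eqs. (3.119) through (3.121) for a massive particle and Eq. (3.124) for a photon, so that the low-speed limit and the identity of Eq. (3.122) can be checked. The function names and the sample speeds are illustrative assumptions.

```python
import math

C_LIGHT = 2.99792458e8      # speed of light, m/s
H_PLANCK = 6.62607015e-34   # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg (used as the sample particle)


def lorentz_factor(v):
    """1 / sqrt(1 - v^2/c^2); valid only for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C_LIGHT)**2)


def momentum(m, v):
    """Eq. (3.119): relativistic momentum."""
    return m * v * lorentz_factor(v)


def kinetic_energy(m, v):
    """Eq. (3.120): E_K = m c^2 / sqrt(1 - v^2/c^2) - m c^2."""
    return (lorentz_factor(v) - 1.0) * m * C_LIGHT**2


def total_energy(m, v):
    """Eq. (3.121): E = E_K + E_0."""
    return lorentz_factor(v) * m * C_LIGHT**2


def photon_momentum(wavelength_m):
    """Eq. (3.124) rearranged: p = h / lambda."""
    return H_PLANCK / wavelength_m


if __name__ == "__main__":
    for beta in (0.01, 0.5, 0.9, 0.99):
        v = beta * C_LIGHT
        ratio = kinetic_energy(M_E, v) / (0.5 * M_E * v**2)
        print(f"v/c = {beta:4.2f}: E_K / (m v^2 / 2) = {ratio:.4f}")
    p = photon_momentum(500e-9)
    print(f"500-nm photon: p = {p:.3e} kg m/s, E = pc = {p * C_LIGHT:.3e} J")
```

The ratio of the relativistic to the classical kinetic energy stays close to unity at 1% of the speed of light and grows rapidly as $v \to c$, in line with the discussion of Eq. (3.120).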


3.8 SUMMARY

This chapter started with very basic independent particle systems to derive the three major statistics, i.e., the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac statistics. The classical and quantum statistics were then applied to thermodynamic systems, providing microscopic interpretations of the first, second, and third laws of thermodynamics, as well as of the Bose-Einstein condensate. The velocity distribution and specific heat of ideal gases were explained based on the semi-classical statistics, followed by a brief description of quantum mechanics to understand the quantization of translational, rotational, and vibrational modes. The fundamental emission and absorption processes of molecules or atoms were discussed along with the concept of stimulated emission. Finally, matter-energy conversion was described within the framework of the relativistic theory. While most of the explanations in this chapter are semi-classical and somewhat oversimplified, the chapter should provide a solid background to those who did not have a formal education in statistical mechanics and quantum physics. These materials will be frequently referenced in the rest of the book.

REFERENCES

1. C. L. Tien and J. H. Lienhard, Statistical Thermodynamics, Hemisphere, New York, 1985.
2. R. E. Sonntag and G. J. van Wylen, Fundamentals of Statistical Thermodynamics, Wiley, New York, 1966.
3. J. E. Lay, Statistical Mechanics and Thermodynamics of Matter, Harper Collins Publishers, New York, 1990.
4. C. E. Hecht, Statistical Thermodynamics and Kinetic Theory, W. H. Freeman and Company, New York, 1990.
5. V. P. Carey, Statistical Thermodynamics and Microscale Thermophysics, Cambridge University Press, Cambridge, UK, 1999.
6. F. C. Chou, J. R. Lukes, X. G. Liang, K. Takahashi, and C. L. Tien, "Molecular dynamics in microscale thermophysical engineering," Annu. Rev. Heat Transfer, 10, 144–176, 1999.
7. S. Maruyama, "Molecular dynamics method for microscale heat transfer," in Advances in Numerical Heat Transfer, W. J. Minkowycz and E. M. Sparrow (eds.), Vol. 2, pp. 189–226, Taylor & Francis, New York, 2000.
8. D. J. Griffiths, Introduction to Quantum Mechanics, 2nd ed., Prentice Hall, New York, 2005.
9. H. J. Metcalf and P. van der Straten, Laser Cooling and Trapping, Springer, New York, 1999.
10. G. Burns, High-Temperature Superconductivity: An Introduction, Academic Press, Boston, MA, 1992.
11. A. Einstein, "Zur Quantentheorie der Strahlung," Phys. Z., 18, 121–128, 1917; English translation in Sources of Quantum Mechanics, B. L. Van der Waerden (ed.), North-Holland Publishing Company, Amsterdam, the Netherlands, 1967.
12. H. P. Baltes, "On the validity of Kirchhoff's law of heat radiation for a body in a nonequilibrium environment," Progress in Optics, 13, 1–25, 1976.
13. J. P. Gordon, H. J. Zeiger, and C. H. Townes, "The maser—New type of microwave amplifier, frequency standard, and spectrometer," Phys. Rev., 99, 1264–1274, 1955; A. L. Schawlow and C. H. Townes, "Infrared and optical masers," Phys. Rev., 112, 1940–1949, 1958.
14. R. Wolfson and J. M. Pasachoff, Physics with Modern Physics for Scientists and Engineers, 3rd ed., Addison-Wesley, Reading, MA, 1999.

PROBLEMS

3.1. For a rectangular prism (i.e., a cuboid) whose three sides are $x$, $y$, and $z$, if $x + y + z = 9$, find the values of $x$, $y$, and $z$ so that the volume of the prism is maximum.


3.2. Make a simple computer program to evaluate the relative error of Stirling's formula: $\ln x! \approx x\ln x - x$ for $x = 10$, 100, and 1000.

3.3. For each of the following cases, determine the number of ways to place 25 books on 5 shelves (distinguishable by their levels). The order of books within an individual shelf is not considered. (a) The books are distinguishable, and there is no limit on how many books can be put on each shelf. (b) Same as (a), except that all the books are the same (indistinguishable). (c) The books are distinguishable, and there are 5 books on each shelf. (d) The books are distinguishable, and there are 3 books on the 1st shelf, 4 on the 2nd, 5 on the 3rd, 6 on the 4th, and 7 on the 5th.

3.4. For each of the following cases, determine the number of ways to put 4 books on 10 shelves (distinguishable by their levels). Disregard their order on each shelf. (a) The books are distinguishable, and there is no limit on how many books you can place on each shelf. (b) Same as (a), but there is a maximum of 1 book on any shelf. (c) Same as (a), except that the books are identical (indistinguishable). (d) Same as (b), except that the books are identical.

3.5. A box contains 5 red balls and 3 black balls. Two balls are picked up randomly. Determine the following: (a) What is the probability that the second ball is red? (b) What is the probability that both are red? (c) If the first one is black, what is the probability that the second is red?

3.6. Suppose you toss two dice. What is the probability of getting a total number (a) equal to 5 and (b) greater than 5?

3.7. Draw 5 cards from a deck of 52 cards. (a) What is the probability of getting a royal flush? (b) What is the probability of getting a full house? [A royal flush is a hand with A, K, Q, J, and 10 of the same suit. A full house is a hand with three of one kind and two of another (a pair).]

3.8. Consider a Gaussian distribution function, $f(x) = a\exp[-(x - \mu)^2]$, where $a$ and $\mu$ are positive constants. (a) Find the normalized distribution function $F(x)$. (b) Show that the mean value $\bar{x} = \mu$. (c) Determine the variance $u_{\mathrm{var}}$ and the standard deviation $\sigma$.

3.9. The speed distribution function for $N$ particles in a fixed volume is given by $f(V) = AV(B - V)/B^3$, where $V$ ($\ge 0$) is the particle speed, and $A$ and $B$ are positive constants. Determine: (a) The probability density function $F(V)$. (b) The number of particles $N$ in the volume. (c) The minimum speed $V_{\min}$ and maximum speed $V_{\max}$. (d) The most probable speed where the probability density function is the largest. (e) The average speed $\bar{V}$ and the root-mean-square average speed $V_{\mathrm{rms}} = \sqrt{\overline{V^2}}$.

3.10. Six bosons are to be placed in two energy levels, each with a degeneracy of two. Evaluate the thermodynamic probability of all arrangements. What is the most probable arrangement?

3.11. Four fermions are to be placed in two energy levels, each with a degeneracy of four. Evaluate the thermodynamic probability of each arrangement. What is the most probable arrangement?

3.12. Derive the Fermi-Dirac distribution step by step. Clearly state all assumptions. Under which condition can it be approximated by the Maxwell-Boltzmann distribution?

3.13. What is the Boltzmann constant and how is it related to the universal gas constant? Show that the ideal gas equation can be written as $P = nk_BT$. What is the number density of air at standard conditions (1 atm and 25°C)?

3.14. How many molecules are there per unit volume (number density) for a nitrogen gas at 200 K and 20 kPa? How would you estimate the molecular spacing (average distance between two adjacent molecules)?

3.15. Use Eq. (3.28a) and $1/T = (\partial S/\partial U)_{V,N}$ to show that $\beta = 1/k_BT$.

3.16. Show that $\beta = 1/k_BT$ and $\alpha = -\mu/k_BT$ for all the three statistics. [Hint: Follow the discussion in Sec. 3.2 with a few more steps.]

3.17. Consider 10 indistinguishable particles in a fixed volume that obey the Bose-Einstein statistics. There are three energy levels with $\varepsilon_0 = 0.5$ eu, $\varepsilon_1 = 1.5$ eu, and $\varepsilon_2 = 2.5$ eu, where "eu" refers to a certain energy unit. The degeneracies are $g_0 = 1$, $g_1 = 3$, and $g_2 = 5$, respectively.


(a) If the degeneracy were not considered, in how many possible ways could you arrange the particles on the three energy levels? (b) You may notice that different arrangements may result in the same energy. For example, both the arrangement with $N_1 = 9$, $N_2 = 0$, $N_3 = 1$ and the arrangement with $N_1 = 8$, $N_2 = 2$, $N_3 = 0$ yield an internal energy $U = 7$ eu. How many arrangements are there with $U = 9$ eu? Calculate the thermodynamic probability for all macrostates with $U = 9$ eu. (c) The ground state refers to the state corresponding to the lowest possible energy of the system. Determine the ground-state energy and entropy. What is the temperature of this system at the ground state? (d) How many microstates are there for the macrostate with $U = 25$ eu?

3.18. Consider a system of a single type of constituents, with $N$ particles (distinguishable from the statistical point of view) and only two energy levels $\varepsilon_0 = 0$ and $\varepsilon_1 = \varepsilon$ (nondegenerate). (a) What is the total number of microstates in terms of $N$? How many microstates are there for the macrostate that has energy $U = (N - 1)\varepsilon$? Show that the energy of the most probable macrostate is $N\varepsilon/2$. (b) What are the entropies of the states with $U = 0$ and $U = (N - 1)\varepsilon$? Sketch $S$ as a function of $U$. Comment on the negative temperature, $1/T = (\partial S/\partial U)_{V,N} < 0$. Is it possible to have a system with a negative absolute temperature?

3.19. A system consists of six indistinguishable particles that obey Bose-Einstein statistics with two energy levels. The associated energies are $\varepsilon_0 = 0$ and $\varepsilon_1 = \varepsilon$, and the associated degeneracies are $g_0 = 1$ and $g_1 = 3$. Answer the following questions: (a) How many possible macrostates are there? How many microstates correspond to the macrostate with three particles on each energy level? (b) What is the most probable macrostate, and what are its corresponding energy $U$ and thermodynamic probability $\Omega$? (c) Show that at 0 K, both the energy and the entropy of this system are zero. Also, show that for this system the entropy increases as the energy increases.

3.20. From the Sackur-Tetrode equation, show that $s_2 - s_1 = c_p\ln(T_2/T_1) - R\ln(P_2/P_1)$.

3.21. Write $U$, $P$, $A$, and $S$ in terms of the partition function $Z$. Express $H$ and $G$ in terms of the partition function $Z$. For an ideal monatomic gas, express $H$ and $G$ in terms of $T$ and $P$.

3.22. For an ideal diatomic gas, the partition function can be written as $Z = Z_tZ_rZ_vZ_eZ_D$, where $Z_e = g_{e0}$ is the degeneracy of the ground electronic level, and $Z_D = \exp(D_0/k_BT)$ is the chemical partition function that is associated with the reaction of formation. Here, $g_{e0}$ and $D_0$ can be regarded as constants for a given material. Contributions to the partition function besides the translation are due to internal energy storage and thus are called the internal contribution, i.e., $Z_{\mathrm{int}} = Z_rZ_vZ_eZ_D$. Find the expressions of $U$, $P$, $A$, $S$, $H$, and $G$ in terms of $N$, $T$, and $P$ (or $V$) with appropriate constants, assuming that the temperature $T \gg \Theta_r$ and is comparable with $\Theta_v$.

3.23. For an ideal molecular gas, derive the distribution function in terms of the kinetic energy $\varepsilon = mv^2/2$, i.e., $f(\varepsilon)$.

3.24. Prove Eq. (3.48), Eq. (3.49a), and Eq. (3.50).

3.25. Evaluate and plot the Maxwell speed distribution for Ar gas at 100, 300, and 900 K. Tabulate the average speed, the most probable speed, and the rms speed at these temperatures.

3.26. A special form of the Euler-Maclaurin summation formula is

$$\sum_{j=a}^{\infty} f(j) = \int_a^{\infty} f(x)\,dx + \frac{1}{2}f(a) - \frac{1}{12}f'(a) + \frac{1}{720}f^{(3)}(a) - \frac{1}{30{,}240}f^{(5)}(a) + \cdots$$

Consider the rotational partition function,

$$Z_r = \sum_{j=0}^{\infty}(2j + 1)\exp\left[-j(j + 1)\frac{\Theta_r}{T}\right]$$

and show that

$$Z_r \approx \frac{T}{\Theta_r}\left[1 + \frac{1}{3}\frac{\Theta_r}{T} + \frac{1}{15}\left(\frac{\Theta_r}{T}\right)^2 + \cdots\right]$$

which is Eq. (3.55) for $\sigma = 1$.


3.27. Because of the nuclear spin degeneracy, hydrogen H2 gas consists of two different types: ortho-hydrogen and para-hydrogen. The rotational partition functions can be written, respectively, as

$$Z_{r,\mathrm{ortho}} = 3\sum_{l=1,3,5,\ldots}(2l + 1)\exp\left[-l(l + 1)\frac{\Theta_r}{T}\right]$$

and

$$Z_{r,\mathrm{para}} = \sum_{l=0,2,4,\ldots}(2l + 1)\exp\left[-l(l + 1)\frac{\Theta_r}{T}\right]$$

so that

$$Z_{r,\mathrm{H_2}} = 3\sum_{l=1,3,5,\ldots}(2l + 1)\exp\left[-l(l + 1)\frac{\Theta_r}{T}\right] + \sum_{l=0,2,4,\ldots}(2l + 1)\exp\left[-l(l + 1)\frac{\Theta_r}{T}\right].$$

Evaluate the temperature-dependent specific heat of each of the two types of hydrogen, which can be separated and stay separated for a long time before the equilibrium distribution is restored. Calculate the specific heat of hydrogen in the equilibrium distribution as a function of temperature. The ratio $Z_{r,\mathrm{ortho}}/Z_{r,\mathrm{para}}$ is the same as the equilibrium ratio of the two types and varies from 0 at very low temperatures to 3 near room temperature.

3.28. Calculate the specific heat and the specific heat ratio $\gamma = c_p/c_v$ for nitrogen N2 at 30, 70, 300, and 1500 K. Assume the pressure is sufficiently low for it to be an ideal gas.

3.29. Calculate the specific heat and the specific heat ratio $\gamma = c_p/c_v$ for oxygen O2 at 50, 100, 300, and 2000 K. Assume the pressure is sufficiently low for it to be an ideal gas.

3.30. Estimate the mole and mass specific heats of CO gas at 100, 300, and 3000 K. Show in a specific heat versus temperature graph the contributions from different modes.

3.31. (a) How many rotational degrees of freedom are there in a CO2 molecule and in a H2O molecule? (b) If the temperature of a low-pressure CO2 gas is raised high enough to completely excite its rotational and vibrational modes, what will be its specific heats $c_v$ and $c_p$? Express your answer in both kJ/(kg·K) and kJ/(kmol·K).

3.32. Compute and plot the temperature-dependent specific heat for the following ideal gases and compare your results with tabulated data or graphs: (a) CO2, (b) H2O, and (c) CH4.

3.33. Do a literature search to discuss the following topics: (a) the significance of partition functions, (b) the different types of statistical ensembles, and (c) statistical fluctuations.

3.34. We have discussed the translational degeneracy $dg$ in a 3-D space with a volume $V$, as given in Eq. (3.85). Consider the situation when the particle is confined in a 2-D square potential well. Find the proper wavefunctions and the energy eigenvalues. Assuming the area $A$ is very large, find the translational degeneracy $dg$ in terms of $A$, $m$, $\varepsilon$, and $d\varepsilon$.

3.35. Estimate the speed an electron needs in order to escape from the ground state of a hydrogen atom. What is the de Broglie wavelength of the electron at the initial speed? If a photon is used to knock out the electron in the ground state, what would be the wavelength of the photon? Why is it inappropriate to consider the electron movement in an atom as an analogy to the movement of Mars in the solar system?

3.36. For the harmonic oscillator problem discussed in Sec. 3.5.5, show that Eq. (3.111) is a solution of Eq. (3.107) for $n = 0$, 1, and 2. Plot $\Psi_0^2$, $\Psi_1^2$, and $\Psi_2^2$ to discuss the differences between classical mechanics and quantum mechanics.


CHAPTER 4

KINETIC THEORY AND MICRO/NANOFLUIDICS

Statistical mechanics involves determination of the most probable state and equilibrium distributions, as well as evaluation of the thermodynamic properties in the equilibrium states. Kinetic theory deals with the local average of particle properties and can be applied to nonequilibrium conditions to derive transport equations [1–7]. Kinetic theory, statistical mechanics, and molecular dynamics are based on the same hypotheses; they are closely related and overlap each other in some aspects. Knowledge of kinetic theory is important to understanding gas dynamics, as well as electronic and thermal transport phenomena in solid materials. In this chapter, we first introduce the simple kinetic theory of ideal gases based on the mean-free-path approximation. While it can help us obtain the microscopic formulation of several familiar transport equations and properties, the simple kinetic theory is limited to local equilibrium and, hence, is good only for time durations much longer than the mechanistic timescale, called the relaxation time. The advanced kinetic theory is based on the Boltzmann transport equation (BTE), which will also be presented in this chapter. The BTE is an integro-differential equation of the distribution function in terms of space, velocity, and time. It takes into account changes in the distribution function caused by external forces and collisions between particles. Many macroscopic phenomenological equations, such as Fourier's law of heat conduction, the Navier-Stokes equation for viscous flow, and the equation of radiative transfer for photons and phonons, can be derived from the BTE under the assumption of local equilibrium. Finally, in the last section of this chapter, we present the application of kinetic theory to the flow of dilute gases in micro/nanostructures and the associated heat transfer. The application of kinetic theory to heat conduction in metals and dielectrics will be discussed in forthcoming chapters.

4.1 KINETIC DESCRIPTION OF DILUTE GASES

In this section, we introduce the simple kinetic theory of ideal molecular gases. The purpose is to provide a step-by-step learning experience leading to more advanced topics. The kinetic theory of molecules rests on several hypotheses and assumptions.

• Molecular hypothesis: Matter is composed of small discrete particles (molecules or atoms); any macroscopic volume contains a large number of particles. At 25°C and 1 atm, a 1-μm³ space of an ideal gas contains about 25 million molecules.
• Statistical hypothesis: Time average is often used since any macroscopic observation takes much longer than the characteristic timescale of molecular motion (such as the average time lapse between two subsequent collisions of a given molecule).
• Kinetic hypothesis: Particles obey the laws of classical mechanics.
• Molecular chaos: The velocity and position of a particle are uncorrelated. The velocities of any two particles are not correlated.
• Ideal gas assumptions: Molecules are rigid spheres resembling billiard balls. Each molecule has a diameter $d$ and a mass $m$. All collisions are elastic and conserve both energy and momentum. Molecules are widely separated in space (i.e., a dilute gas). Intermolecular forces are negligible except during molecular collisions. The duration of a collision is negligible compared with the time between collisions. No collision can occur with more than two particles.

The general molecular distribution function is $f(\mathbf{r}, \mathbf{v}, t)$, which is a function of space, velocity, and time. The distribution function gives the particle (number) density in the phase space at any time. Therefore, the number of particles in a volume element of the phase space is

$$dN = f(\mathbf{r}, \mathbf{v}, t)\,dx\,dy\,dz\,dv_x\,dv_y\,dv_z = f(\mathbf{r}, \mathbf{v}, t)\,dV\,d\varpi \tag{4.1}$$

where we have used $\varpi$ for the velocity space ($d\varpi = dv_x\,dv_y\,dv_z$). Integrating Eq. (4.1) over the velocity space gives the number of particles per unit volume, or the number density, as

$$n(\mathbf{r}, t) = \frac{dN}{dV} = \int_{\varpi} f(\mathbf{r}, \mathbf{v}, t)\,d\varpi \tag{4.2}$$

Note that the density is $\rho(\mathbf{r}, t) = m\,n(\mathbf{r}, t)$, where $m$ is the mass of a particle. The total number of particles inside the volume $V$ as a function of time is then

$$N(t) = \iint_{V,\varpi} f(\mathbf{r}, \mathbf{v}, t)\,dV\,d\varpi \tag{4.3}$$

In a thermodynamic equilibrium state,

$$f(\mathbf{r}, \mathbf{v}, t) = f(\mathbf{v}) \tag{4.4}$$

which is independent of space and time. Any intensive property will be the same everywhere.

4.1.1 Local Average and Flux

Let $\psi = \psi(\mathbf{r}, \mathbf{v}, t)$ be any additive property of a single molecule, such as kinetic energy and momentum. Note that $\psi$ may be a scalar or a vector. The local average, or simply the average, of the property $\psi$ is defined as

$$\bar{\psi} = \frac{\int_{\varpi} f\psi\,d\varpi}{\int_{\varpi} f\,d\varpi} = \frac{1}{n}\int_{\varpi} f\psi\,d\varpi \tag{4.5}$$

which is a function of $\mathbf{r}$ and $t$. Here $\varpi$ denotes the velocity space, with $d\varpi = dv_x\,dv_y\,dv_z$. The ensemble average is the average over the phase space, i.e.,

$$\langle\psi\rangle = \frac{1}{N}\iint_{V,\varpi} f\psi\,dV\,d\varpi \tag{4.6}$$

For a uniform gas, the local average and the ensemble average are the same.
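The local average of Eq. (4.5) can be illustrated with a small Monte Carlo sketch. It assumes an equilibrium gas whose velocity components follow the Maxwell distribution of Chap. 3 (each Cartesian component Gaussian with variance $k_BT/m$); the function names, the choice of nitrogen, and the sample size are illustrative assumptions, not part of the text.

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro constant, 1/mol

def sample_velocities(mass, temperature, count, seed=0):
    """Draw velocity vectors from an equilibrium Maxwell distribution.

    Each Cartesian component is Gaussian with zero mean and variance k_B*T/m.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(K_B * temperature / mass)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(count)]

def local_average(prop, velocities):
    """Discrete analog of Eq. (4.5): average a per-molecule property over samples."""
    return sum(prop(v) for v in velocities) / len(velocities)

temperature = 300.0                  # K
mass_n2 = 28.0134e-3 / N_A           # mass of one N2 molecule, kg (illustrative gas)
sigma_n2 = math.sqrt(K_B * temperature / mass_n2)
vels = sample_velocities(mass_n2, temperature, 100_000)

# Scalar property: translational kinetic energy of one molecule.
mean_ke = local_average(lambda v: 0.5 * mass_n2 * (v[0]**2 + v[1]**2 + v[2]**2), vels)
# One component of a vector property: x velocity (zero on average without bulk motion).
mean_vx = local_average(lambda v: v[0], vels)

ratio = mean_ke / (1.5 * K_B * temperature)
print(f"mean KE / (3/2 k_B T) = {ratio:.4f}, mean v_x / sigma = {mean_vx / sigma_n2:.4f}")
```

With enough samples the kinetic-energy ratio approaches unity and the mean velocity component approaches zero, which is what an isotropic, stationary gas should give.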


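The transfer of a molecular property $\psi$ across an area element $dA$ per unit time per unit area is called the flux of $\psi$. As shown in Fig. 4.1, particles having velocities between $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$ that will pass through the area $dA$ in the time interval $dt$ must be contained in the inclined cylinder, whose volume is $dV = v\,dt\cos\theta\,dA = \mathbf{v}\cdot\mathbf{n}\,dA\,dt$. It is assumed that $dt$ is sufficiently small such that particle-particle collisions can be neglected.

FIGURE 4.1 Illustration of the flux of particles and quantities through a surface.

The number of particles with velocities between $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$ within the inclined cylinder can be calculated by

$$f(\mathbf{r}, \mathbf{v}, t)\,dV\,d\varpi = f(\mathbf{r}, \mathbf{v}, t)\,\mathbf{v}\cdot\mathbf{n}\,dA\,dt\,d\varpi \tag{4.7}$$

where $d\varpi = dv_x\,dv_y\,dv_z$ is an element of the velocity space. The flux of the property $\psi$ is then

$$\text{flux of } \psi \text{ within } d\varpi = \frac{\psi f(\mathbf{r}, \mathbf{v}, t)\,\mathbf{v}\cdot\mathbf{n}\,d\varpi\,dA\,dt}{dA\,dt}$$

Integrating over all velocities yields the total flux of $\psi$:

$$J_\psi = \int_{\varpi} \psi f\,\mathbf{v}\cdot\mathbf{n}\,d\varpi \tag{4.8}$$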

Equation (4.8) gives the net flux since it is evaluated for all of the velocity space $\varpi$, or over a solid angle of $4\pi$ in spherical coordinates. Very often the integration is performed over the hemisphere with $\mathbf{v}\cdot\mathbf{n} = v\cos\theta > 0$ for positive flux or $\mathbf{v}\cdot\mathbf{n} = v\cos\theta < 0$ for negative flux. When $\psi = 1$, Eq. (4.8) gives the particle flux:

$$J_N = \int_{\varpi} f\,\mathbf{v}\cdot\mathbf{n}\,d\varpi \tag{4.9}$$

In an equilibrium state, this integration can be evaluated using spherical coordinates. Noting that $\mathbf{v}\cdot\mathbf{n} = v\cos\theta$ and $f = f(\mathbf{v})$, which is independent of the direction (isotropic), we can obtain the particle flux in the positive $z$ direction by integrating over the hemisphere in the velocity space, i.e.,

$$J_N = \int_{v=0}^{\infty}\int_{\phi=0}^{2\pi}\int_{\theta=0}^{\pi/2} f(\mathbf{v})\,v^3\cos\theta\sin\theta\,d\theta\,d\phi\,dv = \pi\int_0^{\infty} f(\mathbf{v})\,v^3\,dv \tag{4.10}$$


In writing Eq. (4.10), we have kept the vector variable in $f(\mathbf{v})$ to signify that it is a velocity distribution. One should bear in mind that the last expression is based on the fact that $f(\mathbf{v})$ is not a function of $\theta$ and $\phi$. For an ideal molecular gas, $f(\mathbf{v})$ is given by the Maxwell velocity distribution, i.e., Eq. (3.43) in Chap. 3. If the integration in Eq. (4.10) is performed over the whole sphere with $\theta$ from 0 to $\pi$, we would obtain the net flux of particles, which is zero in the equilibrium case. The average speed can be evaluated using Eq. (4.5); hence,

$$\bar{v} = \frac{1}{n}\int_{\varpi} f(\mathbf{v})\,v\,d\varpi = \frac{1}{n}\iiint_{v,\phi,\theta} f(\mathbf{v})\,v^3\sin\theta\,d\theta\,d\phi\,dv = \frac{4\pi}{n}\int_0^{\infty} f(\mathbf{v})\,v^3\,dv \tag{4.11}$$

Here, we have assumed an isotropic distribution function to obtain the last expression. The above equation is evaluated over the solid angle of $4\pi$ to obtain the average over all velocities. Comparing Eq. (4.10) and Eq. (4.11), we can see that

$$J_N = \frac{n\bar{v}}{4} \tag{4.12a}$$

For an ideal gas, since $f(\mathbf{v})$ is given by the Maxwell velocity distribution, Eq. (3.44), we obtain

$$J_N = \frac{n\bar{v}}{4} = n\sqrt{\frac{k_B T}{2\pi m}} \tag{4.12b}$$
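To give a feel for the magnitudes in Eq. (4.12), the following minimal sketch evaluates the number density, mean speed, and one-sided particle flux for an ideal gas. It assumes the ideal gas relation $n = P/k_BT$ and the Maxwell mean speed $\bar{v} = \sqrt{8k_BT/\pi m}$; nitrogen at 25°C and 1 atm is an illustrative choice, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro constant, 1/mol

def number_density(pressure, temperature):
    """Ideal gas number density n = P / (k_B T), in 1/m^3."""
    return pressure / (K_B * temperature)

def mean_speed(mass, temperature):
    """Mean molecular speed of the Maxwell distribution, sqrt(8 k_B T / (pi m))."""
    return math.sqrt(8.0 * K_B * temperature / (math.pi * mass))

def particle_flux(pressure, temperature, mass):
    """One-sided particle flux J_N = n * v_bar / 4, Eq. (4.12a), in 1/(m^2 s)."""
    return 0.25 * number_density(pressure, temperature) * mean_speed(mass, temperature)

pressure = 101325.0              # Pa (1 atm)
temperature = 298.15             # K (25 degrees C)
mass_n2 = 28.0134e-3 / N_A       # kg, one N2 molecule (illustrative gas)

n = number_density(pressure, temperature)
v_bar = mean_speed(mass_n2, temperature)
flux = particle_flux(pressure, temperature, mass_n2)
# Closed form of Eq. (4.12b) for comparison.
flux_closed_form = n * math.sqrt(K_B * temperature / (2.0 * math.pi * mass_n2))

print(f"n = {n:.3e} 1/m^3, v_bar = {v_bar:.1f} m/s, J_N = {flux:.3e} 1/(m^2 s)")
```

The number density comes out near $2.5\times10^{25}$ m⁻³, i.e., roughly 25 million molecules per cubic micrometer, and the two expressions for the flux agree to rounding error.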

Because each particle has the same mass, the mass flux is given by

$$J_m = m\int_{\varpi} f\,\mathbf{v}\cdot\mathbf{n}\,d\varpi = \frac{\rho\bar{v}}{4} \tag{4.13}$$

Substituting $\psi = mv^2/2$ into Eq. (4.8), one obtains the kinetic energy flux $J_{KE}$. In an equilibrium state with an isotropic distribution, the kinetic energy flux in the positive $z$ direction is $J_{KE} = (\pi m/2)\int_0^{\infty} f(\mathbf{v})\,v^5\,dv$, whereas the net kinetic energy flux is zero. Note that Eq. (4.8) is a general equation that is also applicable to nonequilibrium and anisotropic distributions. When $\psi = m\mathbf{v}$, the momentum flux is a vector, which is often handled by considering individual components. Note that the rate of transfer of momentum across a unit area is equal to the force that the area must exert upon the gas to sustain the equilibrium. Furthermore, the surface may be projected to three orientations, yielding a nine-component tensor in the momentum flux:

$$P_{ij} = \int_{\varpi} (m v_j)\,f\,v_i\,d\varpi, \quad i, j = 1, 2, 3 \tag{4.14a}$$

Here, $(v_1, v_2, v_3)$ and $(v_x, v_y, v_z)$ are used interchangeably. Let $P = \rho\overline{v_i^2}$, which is always positive, and $\tau_{ij} = -\rho\overline{v_j v_i}$ for $i \neq j$ and 0 for $i = j$. We can rewrite the above equation as

$$P_{ij} = nm\,\overline{v_j v_i} = \rho\,\overline{v_j v_i} = P\delta_{ij} - \tau_{ij} \tag{4.14b}$$

where $\delta_{ij}$ is the Kronecker delta, which is equal to 1 when $i = j$ and 0 when $i \neq j$. It can be seen that $P$ is the normal stress or static pressure and $\tau_{ij}$ ($i \neq j$) is the shear stress, which is zero in a uniform, stationary gas (without bulk motion). Notice that the velocity distribution in the vicinity of the wall is the same as that away from the wall because of the reflection by the wall. The pressure is now related to the momentum flux, i.e., $3P = \rho(\overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2}) = \rho\overline{v^2}$, or

$$P = \frac{1}{3}\rho\overline{v^2} \tag{4.15}$$

which is Boyle's law. Compared with the ideal gas equation, the right-hand side must be related to temperature. In kinetic theory, temperature is associated with the mean translational kinetic energy of the molecule, i.e.,

$$\frac{3}{2}k_B T = \frac{1}{2}m\overline{v^2} = \frac{1}{2}m\overline{v_x^2} + \frac{1}{2}m\overline{v_y^2} + \frac{1}{2}m\overline{v_z^2} \tag{4.16}$$

We have derived this equation from statistical mechanics in Chap. 3. The temperature defined based on the kinetic energy of the particles is sometimes referred to as the kinetic temperature. Combining Eq. (4.15) and Eq. (4.16), we get the ideal gas equation, $P = nk_BT$, as expected. From the above discussion, one can see clearly how macroscopic properties such as pressure and temperature are related to the particle distribution function. For ideal gases at equilibrium, we have derived the Maxwell velocity and speed distributions in Chap. 3.

Example 4-1. Show that $P = \rho\overline{v_n^2}$, where $v_n$ is the velocity component normal to the wall, and $P = \rho\overline{v^2}/3$ for an equilibrium distribution.

Solution. Consider the horizontal plane shown in Fig. 4.1 as the wall, below which is a gas in equilibrium. Multiplying Eq. (4.7) by $m\mathbf{v}$ gives the momentum of the particles with velocities between $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$ impinging on the wall, $m\mathbf{v}f(\mathbf{v})\,\mathbf{v}\cdot\mathbf{n}\,dA\,dt\,d\varpi$, which of course is equal to the impulse on the wall, $d\mathbf{F}\,dt$. The normal component $v_n = \mathbf{v}\cdot\mathbf{n} = v\cos\theta$ contributes an impulse on the wall, $mv_n^2 f(\mathbf{v})\,dA\,dt\,d\varpi$, that is always positive regardless of the sign of $v_n$. However, the contributions of all parallel components cancel out due to isotropy. The pressure can be evaluated by integrating over all velocities, i.e., $P = \int_{\varpi} mv_n^2 f(\mathbf{v})\,d\varpi = mn\overline{v_n^2} = \rho\overline{v_n^2}$. We have used the definition of local average given by Eq. (4.5). If the distribution is isotropic, then

$$P = m\int_0^{\infty}\int_0^{2\pi}\int_0^{\pi} f(\mathbf{v})\,v^4\cos^2\theta\sin\theta\,d\theta\,d\phi\,dv = \frac{4\pi m}{3}\int_0^{\infty} f(\mathbf{v})\,v^4\,dv$$

since $v_n = v\cos\theta$. Compared with

$$\overline{v^2} = \frac{1}{n}\int_0^{\infty}\int_0^{2\pi}\int_0^{\pi} f(\mathbf{v})\,v^4\sin\theta\,d\theta\,d\phi\,dv = \frac{4\pi}{n}\int_0^{\infty} f(\mathbf{v})\,v^4\,dv$$

we obtain $P = mn\overline{v^2}/3 = \rho\overline{v^2}/3$. The distribution function is uniform inside the container; hence, the wall may be a physical wall or merely an imaginary one since pressure exists everywhere in the fluid.
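The result of Example 4-1 can be checked numerically. The sketch below assumes the Maxwell velocity distribution in the form $f(\mathbf{v}) = n\,(m/2\pi k_BT)^{3/2}\exp(-mv^2/2k_BT)$ and evaluates $P = (4\pi m/3)\int_0^\infty f v^4\,dv$ by Simpson's rule; the gas parameters, the finite integration cutoff, and the helper names are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def maxwell_f(v, n, mass, temperature):
    """Equilibrium (isotropic) velocity distribution evaluated at speed v."""
    a = mass / (2.0 * K_B * temperature)
    return n * (a / math.pi) ** 1.5 * math.exp(-a * v * v)

def simpson(func, lower, upper, intervals=2000):
    """Composite Simpson rule; `intervals` must be even."""
    h = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3.0

def pressure_from_distribution(n, mass, temperature):
    """P = (4*pi*m/3) * integral of f(v) v^4 dv, as in Example 4-1."""
    # Cut the integral off at 12 thermal speeds; the Gaussian tail beyond is negligible.
    v_max = 12.0 * math.sqrt(K_B * temperature / mass)
    integral = simpson(lambda v: maxwell_f(v, n, mass, temperature) * v ** 4, 0.0, v_max)
    return 4.0 * math.pi * mass / 3.0 * integral

n = 2.5e25          # 1/m^3, illustrative number density
mass = 4.65e-26     # kg, roughly one N2 molecule
temperature = 300.0 # K

p_kinetic = pressure_from_distribution(n, mass, temperature)
p_ideal = n * K_B * temperature
print(f"kinetic-theory pressure = {p_kinetic:.6e} Pa, n k_B T = {p_ideal:.6e} Pa")
```

The two pressures agree to within the quadrature error, which is the numerical counterpart of combining Eq. (4.15) with Eq. (4.16).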

4.1.2 The Mean Free Path

The mean free path, defined as the average distance the particle travels between two subsequent collisions, is a very important concept. It is often used to determine whether a given phenomenon belongs to the macroscale (continuum) regime or falls in the microscale regime, where the governing equations derived under the assumption of local equilibrium break down. One of the applications is in microfluidics, to be discussed later in this chapter, and another is in the electrical and heat conduction in solids, which will be studied in Chap. 6.

Consider the case in Fig. 4.2: a particle of diameter $d$ moving at an average velocity $\bar{v}$ (assuming all other particles are at rest). During a time interval $dt$, the volume swept by the particle within $d$ from the centerline is $dV = \pi d^2\bar{v}\,dt$. The $n\,dV$ particles whose centers are inside this volume element will collide with the moving particle. Therefore, the frequency of collisions, i.e., the number of collisions per unit time, is $\pi n d^2\bar{v}$. The time between two subsequent collisions, $\tau$, is the inverse of the frequency of collision.

FIGURE 4.2 Schematic used for a simple derivation of the mean free path.

The mean free path $\Lambda$ is the average distance that a particle travels between two subsequent collisions and is equal to the ratio of the average velocity to the frequency of collision. Therefore,

$$\Lambda = \bar{v}\tau \approx (\pi n d^2)^{-1} \tag{4.17}$$

and depends only on the particle size and the number density. The average time between two subsequent collisions, $\tau$, is termed the relaxation time, and the average frequency of collision, $\tau^{-1}$, is the scattering rate or collision rate. The scattering rate is the average number of collisions an individual particle experiences per unit time. For electrons, whose diameters are negligible compared with those of the other particles that scatter them, the mean free path is

$$\Lambda_{\text{electron (or photon)}} = \frac{1}{nA_c} \tag{4.18}$$

where $A_c$ is the scattering cross-sectional area and $n$ is the number density of the scatterers, such as phonons or defects. Equation (4.18) also applies to photons that can be scattered by particles, such as molecules in the atmosphere. The photon mean free path is also called the radiation penetration depth, as will be discussed in Chap. 8. When the relative movement of particles is considered based on the Maxwell velocity distribution, Eq. (4.17) is modified slightly for an ideal gas as follows:

$$v_{tw} = \frac{v_g}{\sqrt{3}} \tag{7.7}$$

Equation (7.7) relates the speed of the temperature wave to the speed of sound in an insulator. The square root of three can be understood as due to the randomness of thermal fluctuations in a 3-D medium, just like the relation between the velocity and its components, $v^2 = v_x^2 + v_y^2 + v_z^2$, in kinetic theory. Equation (7.5) indeed sets a limit on the heat propagation speed, which is manifested by a sharp wavefront that travels at $v_{tw}$ inside the medium for a sudden temperature change at the boundary. As a wave equation, the solution is a temperature field with both an amplitude and a phase. Theoretically, the temperature wave can be reflected by another boundary and can interfere, constructively or destructively, with a forward propagating wave. The interaction between the temperature waves may also result in a resonance effect, a typical wave phenomenon. Numerous analytical and numerical predictions have been made, as referenced in the work of Özişik and Tzou,4 along with Yeung and Lam,5 Haji-Sheikh et al.,6 and Gembarovic and Gembarovic, Jr.7

It should be noted that the terms heat wave3 and thermal wave4 have also been frequently used in the literature to describe the temperature wave behavior. The term “temperature wave” is used in this chapter for the wavelike behavior associated with the hyperbolic-type heat equations, because “heat wave” might be confused with the calamitous weather phenomenon and “thermal wave” might be confused with the diffusion wave used in photoacoustic techniques. Bennett and Patty (Appl. Opt., 21, 49, 1982) clarified:

“The term thermal wave interference is used to mean the superposition of simple harmonic solutions of the thermal diffusion equation. Although wavelike in nature there are important differences between thermal waves arising from a differential equation that is of the first order in time and waves that are solution to a wave equation that is of the second order in time.”
In the heat transfer literature, thermal wave often refers to periodic-heating techniques used widely for thermophysical property measurements. Let us consider an example of a semi-infinite solid under a constant heat flux at the surface. Figure 7.2 illustrates the solutions for a small $t_p$ and a large $t_p$, compared with $\tau$. Here again, we have assumed $\tau_q = \tau$. The propagation speed is equal to $v_{tw}$, and the pulse wavefront is given by $x_1 = v_{tw}t_1$ and $x_2 = v_{tw}t_2$. Hence, $x_1 < x_2 < \Lambda$, where $\Lambda = v_g\tau$ is the mean free path. In the case of a short pulse, the temperature pulse propagates and its height decays by dissipating its energy to the medium as it travels. The parabolic heat equation, on the other hand, predicts a continuous temperature distribution without any wavefront (see Fig. 7.2). As time passes, the first-order time derivative, or the diffusion term, in Eq. (7.5) dominates. If the relative change of $\partial T/\partial t$ or $\mathbf{q}''$ during one $\tau_q$ is large, then the wave feature is important. This should happen immediately after a sudden thermal disturbance that results in a temporal nonequilibrium, as well as a spatial nonequilibrium near the heat pulse or the wavefront. After a sufficiently long time, usually 5 to 10 times $\tau_q$, a local equilibrium will be reestablished, and the thermal field can be described by the parabolic heat equation. At steady state, the hyperbolic and parabolic equations predict the same results. While Eq. (7.4) is mathematically more general than the heat diffusion equation, it should not be taken as a correction, or a more realistic theory than the Fourier conduction model, because the Cattaneo equation has not been justified on a fundamental basis, nor has it been validated by any plausible experiments.

FIGURE 7.2 (Not to scale) Illustration of the solution of the hyperbolic heat equation at short timescales. (a) A short pulse, $t_p \ll \tau$. (b) A long pulse, $t_p > \tau$. The solid curves are the solutions of the hyperbolic heat equation, Eq. (7.5), and the dash-dotted and dashed curves are the solutions, calculated from Eq. (7.1), obtained from the heat diffusion equation.

Many researchers have investigated the hyperbolic heat equation based on the second law of thermodynamics.8–10 It has been found that the hyperbolic heat equation sometimes predicts a negative entropy generation and even allows energy to be transferred from a lower-temperature region to a higher-temperature region. The entropy generation rate for heat conduction without an internal source can be calculated by10

$$\dot{s}_{gen} = -\frac{1}{T^2}\,\mathbf{q}''\cdot\nabla T = \frac{1}{\kappa T^2}\,\mathbf{q}''\cdot\left(\mathbf{q}'' + \tau_q\frac{\partial\mathbf{q}''}{\partial t}\right) \tag{7.8}$$

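The above equation was obtained by setting the energy and entropy balances as follows:

$$\rho\frac{\partial u}{\partial t} = -\nabla\cdot\mathbf{q}'' \quad\text{and}\quad \rho\frac{\partial s}{\partial t} = -\nabla\cdot\left(\frac{\mathbf{q}''}{T}\right) + \dot{s}_{gen} \tag{7.9}$$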

Note that $du = T\,ds$. A negative entropy generation can easily be numerically demonstrated from Eq. (7.5) during the temperature wave propagation. Here, a negative entropy generation does not constitute a violation of the second law of thermodynamics because the concept of “temperature” in the hyperbolic heat equation cannot be interpreted in the conventional sense due to the lack of local thermal equilibrium. Extended irreversible thermodynamics has been proposed by Jou et al. by modifying the definition of entropy such that it is not a property of the system anymore but depends on the heat flux vector.11 The theory of extended irreversible thermodynamics is self-consistent but has not been experimentally validated; hence, it cannot be taken as a generalized thermodynamic theory. Similarly, the hyperbolic heat equation should not be treated as a more general theory over Fourier’s heat conduction theory.

Example 7-2. Derive the modified Fourier equation, or the Cattaneo equation, based on the BTE under the relaxation time approximation.

Solution. Tavernier (C. R. Acad. Sci., 254, 69, 1962) first showed that the Cattaneo equation could be derived for phonons and electrons using the relaxation time approximation of the BTE. Let us first review Sec. 4.3.2, where we have derived Fourier’s law based on the BTE. Again, let us start by assuming that the temperature gradient is in the $x$ direction only. The transient 1-D BTE under the relaxation time approximation can be written as follows:

$$\frac{\partial f}{\partial t} + v_x\frac{\partial f}{\partial x} = \frac{f_0 - f}{\tau}$$

A further assumption is made such that $\partial f/\partial x \approx \partial f_0/\partial x = (\partial f_0/\partial T)(\partial T/\partial x)$, which is exactly the condition of local equilibrium. Multiplying the earlier equation by $\tau\varepsilon v_x$ and then integrating each term over the momentum space, we obtain, by noting $\int_{\varpi}\varepsilon v_x f_0\,d\varpi = 0$, that

$$\frac{\partial}{\partial t}\int_{\varpi}\tau v_x f\varepsilon\,d\varpi + \int_{\varpi}\tau v_x^2\frac{\partial f_0}{\partial T}\frac{\partial T}{\partial x}\varepsilon\,d\varpi = -\int_{\varpi} v_x f\varepsilon\,d\varpi \tag{7.10a}$$

or

$$\tau\frac{\partial q''_x}{\partial t} + \kappa\frac{\partial T}{\partial x} = -q''_x \tag{7.10b}$$

which can be generalized to the 3-D case as given in Eq. (7.2), after replacing $\tau$ with $\tau_q$.
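Mathematically, Eq. (7.10b) is a first-order relaxation equation for the heat flux. The following sketch, offered only as an illustration of that mathematical behavior and not as support for the physical validity of the equation, integrates $\tau\,\partial q''_x/\partial t + q''_x = -\kappa\,\partial T/\partial x$ for a temperature gradient that is switched on at $t = 0$ and then held constant. The material values and function name are illustrative assumptions.

```python
import math

def cattaneo_flux_history(kappa, tau, grad_t, t_end, steps):
    """Integrate tau*dq/dt + q = -kappa*dT/dx for a constant gradient applied at t = 0.

    The exact exponential update is used over each time step, so the result does
    not depend on the step size as long as the gradient stays constant.
    Returns a list of (time, flux) pairs starting from zero flux.
    """
    dt = t_end / steps
    decay = math.exp(-dt / tau)
    q_fourier = -kappa * grad_t          # steady-state (Fourier) value
    q = 0.0
    history = [(0.0, q)]
    for i in range(1, steps + 1):
        q = q_fourier + (q - q_fourier) * decay
        history.append((i * dt, q))
    return history

kappa = 100.0     # W/(m K), illustrative
tau = 1.0e-12     # s, illustrative relaxation time
grad_t = -1.0e6   # K/m, temperature decreasing along +x

history = cattaneo_flux_history(kappa, tau, grad_t, t_end=5.0 * tau, steps=50)
q_fourier = -kappa * grad_t
q_at_tau = history[10][1]     # flux after one relaxation time
q_final = history[-1][1]      # flux after five relaxation times

print(f"q(tau)/q_Fourier = {q_at_tau / q_fourier:.3f}, "
      f"q(5 tau)/q_Fourier = {q_final / q_fourier:.3f}")
```

The flux builds up as $1 - e^{-t/\tau}$ and reaches the Fourier value only after several relaxation times.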

The derivation given in this example, however, does not provide a microscopic justification of the hyperbolic heat equation, because it is strictly valid only under the local-equilibrium assumption with an averaged relaxation time. The local-equilibrium assumption prohibits application of the derived equation to length scales comparable to or smaller than the mean free path.12 Suppose a thermal disturbance occurs at a certain time and location; after a duration of time that is much longer than the relaxation time, the Fourier law and the parabolic heat equation are well justified because both the spatial and temporal local-equilibrium conditions are met. On the other hand, if we wish to use the modified Fourier equation to study the transient behavior at a timescale less than $\tau$, then the disturbance will propagate by a distance shorter than the mean free path, as shown in Fig. 7.2. Therefore, the derivation based on the BTE, under local-equilibrium and relaxation time approximations, is not a microscopic proof of the hyperbolic heat equation, which is meaningful only in a nonequilibrium situation. To this end, it appears that Maxwell made the right choice in dropping terms involving the relaxation time in the paper (Phil. Trans. R. Soc. London, 157, 49, 1867), by assessing that the rate of conduction will rapidly establish itself. While the previous derivation does not support Eq. (7.2), it does not disprove Eq. (7.2) either because the relaxation time approximation is not a very good model in the nonequilibrium regime. The local-equilibrium assumption breaks down completely at extremely short timescales. The basic assumption in the relaxation time approximation is that the distribution function is not too far from equilibrium. For a heat pulse with a duration less than $\tau$, the relaxation time approximation should generally be applied when $t > \tau$, regardless of whether we are dealing with a thin film or a semi-infinite medium.
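The statement that a disturbance travels less than one mean free path within one relaxation time can be made concrete with a few lines of arithmetic. The sketch below takes Eq. (7.7) in the form $v_{tw} = v_g/\sqrt{3}$, which is an assumption consistent with the square-root-of-three remark in the text, and cross-checks it against the wave speed $\sqrt{\alpha/\tau_q}$ of a hyperbolic heat equation using the standard kinetic estimate $\alpha = v_g^2\tau/3$ with $\tau_q = \tau$. The numerical values are illustrative only.

```python
import math

def temperature_wave_speed(group_velocity):
    """Temperature-wave speed, taking Eq. (7.7) as v_tw = v_g / sqrt(3) (assumed form)."""
    return group_velocity / math.sqrt(3.0)

def wavefront_position(group_velocity, time):
    """Distance traveled by the sharp wavefront after `time`."""
    return temperature_wave_speed(group_velocity) * time

v_g = 5000.0     # m/s, illustrative phonon group velocity (speed of sound)
tau = 10.0e-12   # s, illustrative relaxation time

mean_free_path = v_g * tau                 # Lambda = v_g * tau
x_front = wavefront_position(v_g, tau)     # wavefront position at t = tau

# Cross-check: kinetic-theory diffusivity alpha = v_g^2 * tau / 3 (assumed) and
# the hyperbolic wave speed sqrt(alpha / tau_q) with tau_q = tau.
alpha = v_g ** 2 * tau / 3.0
v_tw_from_alpha = math.sqrt(alpha / tau)

print(f"Lambda = {mean_free_path * 1e9:.1f} nm, wavefront at t = tau: {x_front * 1e9:.1f} nm")
```

For any $t < \tau$ the wavefront therefore lies within $\Lambda/\sqrt{3}$ of the heated surface, a region where the local-equilibrium assumption behind the derivation does not hold.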
What may be concluded is that we have failed to prove, either by any fundamental theory or by any credible experiments, that the Cattaneo equation, which originates from kinetic theory under the relaxation time approximation, is a physical law that extends Fourier’s law to the nonequilibrium regime. Atomistic simulations, based on molecular dynamics and the lattice Boltzmann method, have provided further evidence that the hyperbolic heat equation is not applicable at very short timescales or in the nonequilibrium regime, where the applicability of the relaxation time approximation is also questionable.13,14 For this reason, we have intentionally avoided phrases like “generalized Fourier’s equation” and “modified Fourier’s law” in describing Eq. (7.2). One might argue that when $\tau_q$ is identified as the average time $\tau$ between collisions, under the relaxation time approximation, Eq. (7.7) could give the appropriate heat propagation speed, which is $1/\sqrt{3}$ of the speed of sound, as observed in liquid helium and some solids at low temperatures. This is a misinterpretation because the phenomenon, related to the second sound with a characteristic speed $v_{2nd} = v_g/\sqrt{3}$, cannot occur by a single relaxation mechanism, as will be shown later. Nevertheless, after some modifications, there exist special cases when the modified heat equation becomes physically plausible and practically applicable. The modified equation does not produce sharp wavefronts like those illustrated in Fig. 7.2.

7.1.2 Dual-Phase-Lag Model

Chester (Phys. Rev., 131, 2013, 1963) first explained the lagging behavior associated with the Cattaneo equation. He pointed out that the physical significance of the modified Fourier equation lies in that there exists a finite buildup time after a temperature gradient is imposed on the specimen for the onset of a heat flow, which does not start instantaneously but rather grows gradually during the initial period on the order of the relaxation time $\tau$. Conversely, if the thermal gradient is suddenly removed, there will be a lag in the disappearance of the heat current. Gurtin and Pipkin (Arch. Ration. Mech. Anal., 31, 113, 1968) introduced the memory effect to account for the delay of the heat flux with respect to the temperature gradient. They expressed the heat flux as an integration of the temperature gradient over time, in analogy with the stress-strain relationship of viscoelastic materials with instantaneous elasticity. The linearized constitutive equation reads

$$\mathbf{q}''(\mathbf{r}, t) = -\int_{-\infty}^{t} K(t - t')\,\nabla T(\mathbf{r}, t')\,dt' \tag{7.11}$$

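where $K(\xi)$ is a kernel function. When $K(\xi) = \kappa\delta(\xi)$, Eq. (7.11) reduces to Fourier’s law; when $K(\xi) = (\kappa/\tau_q)e^{-\xi/\tau_q}$, Eq. (7.11) reduces to the Cattaneo equation. By assuming

$$K(\xi) = \kappa_0\delta(\xi) + \frac{\kappa_1}{\tau_q}e^{-\xi/\tau_q} \tag{7.12}$$

Joseph and Preziosi showed that the heat flux can be separated into two parts:3

$$\mathbf{q}''(\mathbf{r}, t) = -\kappa_0\nabla T - \frac{\kappa_1}{\tau_q}\int_{-\infty}^{t}\exp\left(-\frac{t - t'}{\tau_q}\right)\nabla T(\mathbf{r}, t')\,dt' \tag{7.13a}$$

Hence,

$$\mathbf{q}'' + \tau_q\frac{\partial\mathbf{q}''}{\partial t} = -\kappa\nabla T - \tau_q\kappa_0\frac{\partial}{\partial t}\nabla T \tag{7.13b}$$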

where $\kappa = \kappa_0 + \kappa_1$ is the steady-state thermal conductivity, as can be seen from Eq. (7.13a). Combined with Eq. (7.3), the heat equation becomes a partial differential equation of the Jeffreys type,

$$\nabla^2 T + \tau_T\frac{\partial}{\partial t}\nabla^2 T = \frac{1}{\alpha}\frac{\partial T}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T}{\partial t^2} \tag{7.14}$$

where $\tau_T = \tau_q\kappa_0/\kappa$ is known as the retardation time.3 The Jeffreys equation was originally developed in the early twentieth century to relate deformation with stress in the earth’s mantle. Unless $\tau_T = 0$ or $\kappa_0 = 0$, Eq. (7.14) maintains the diffusive feature and produces an instantaneous response, albeit small, throughout the medium for an arbitrary thermal disturbance.

In a series of papers published in the early 1990s, Tzou extended the lagging concept to a dual-phase-lag model, as described in his monograph published in 1997.15 He started with the assumption that

$$\mathbf{q}''(\mathbf{r}, t + \tau_q) = -\kappa\nabla T(\mathbf{r}, t + \tau_T) \tag{7.15}$$


The introduction of a delay time $\tau_T$ in Eq. (7.15) implies the existence of a lag in the temperature gradient, with respect to the heat flux driven by an internal or external heat source. The rationale of the phenomenological equation given in Eq. (7.15) was that, in some cases, the heat flux might be viewed as the result of a preceding temperature gradient; in other cases, the temperature gradient might be viewed as the result of a preceding heat flux. The heat flux and the temperature gradient can switch roles in the relationship between “cause” and “effect.” Moreover, both lags might occur simultaneously in certain materials under dramatic thermal disturbances, such as during short-pulse laser heating.4,15 These primitive arguments should not be scrutinized rigorously; rather, they are merely thinking instruments to help us gain an intuitive understanding of the heat flux and temperature gradient relationship. After applying the Taylor expansion to both sides of Eq. (7.15) and using the first-order approximation, one immediately obtains

$$\mathbf{q}'' + \tau_q\frac{\partial\mathbf{q}''}{\partial t} = -\kappa\nabla T - \tau_T\kappa\frac{\partial}{\partial t}\nabla T \tag{7.16}$$

which is mathematically identical to Eq. (7.13b), with the substitution of $\tau_q\kappa_0 = \tau_T\kappa$. Applying the first-order approximation of Eq. (7.15), one may end up with $\mathbf{q}'' + (\tau_q - \tau_T)(\partial\mathbf{q}''/\partial t) = -\kappa\nabla T$, or $\mathbf{q}'' = -\kappa\nabla T - (\tau_T - \tau_q)\kappa\,\partial(\nabla T)/\partial t$, or even $\mathbf{q}'' + (\tau_q - \tfrac{1}{3}\tau_T)(\partial\mathbf{q}''/\partial t) = -\kappa\nabla T - \tfrac{2}{3}\tau_T\kappa\,\partial(\nabla T)/\partial t$. These equations are merely special cases of Eq. (7.16), after regrouping $\tau_q$ and $\tau_T$. The only requirement for Eq. (7.16) to make logical sense is that both $\tau_q$ and $\tau_T$ are nonnegative. The reason that a lag in time has been called a phase lag is perhaps because the temperature field can be viewed as a Fourier transform: $T(\mathbf{r}, t) = \int_{-\infty}^{\infty}\tilde{T}(\mathbf{r}, \omega)e^{-i\omega t}\,d\omega$, where $\tilde{T}(\mathbf{r}, \omega)$ is the Fourier component at frequency $\omega$. The actual phase lag $\omega\tau_T$ (or $\omega\tau_q$ for heat flux) depends on the frequency. Equation (7.16) is mathematically more general and has some advantages over the Cattaneo equation. From now on, Eq. (7.14) will be called the lagging heat equation. It is straightforward to include the source terms in the lagging heat equation, as well as to treat thermophysical properties as temperature dependent. The solution, however, becomes more and more difficult as the complexity increases. Numerous studies have appeared in the literature on analytical solutions and numerical methods.4,15–18

It should be noted that in Eq. (7.12), $\kappa_0$ and $\kappa_1$ denote the effective and elastic conductivities, respectively, and are supposed to be nonnegative.3 Therefore, $\tau_T$ must not be greater than $\tau_q$. In fact, the ratio $\eta = \kappa_0/(\kappa_0 + \kappa_1)$ is a direct indication of whether thermal behavior can be described by heat diffusion (when $\eta = 1$) or the hyperbolic heat equation (when $\eta = 0$). In general, $0 < \eta < 1$, and the thermal process lies somewhere between the two extremes prescribed by Fourier’s law and the Cattaneo equation. In other words, there will be wavelike features in the solution, on which an instantaneous diffusive response throughout the medium is superimposed.
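The split of the heat flux into an instantaneous part and a lagging part can be illustrated with a step change in the temperature gradient. The sketch below evaluates the retardation time $\tau_T = \tau_q\kappa_0/\kappa$ and the conductivity ratio $\kappa_0/(\kappa_0 + \kappa_1)$, and then uses the closed-form result of Eq. (7.13a) for a gradient that is zero for $t < 0$ and constant afterward, $\mathbf{q}'' = -\kappa_0\nabla T - \kappa_1\nabla T\,(1 - e^{-t/\tau_q})$. The function names and numbers are hypothetical and chosen only for illustration.

```python
import math

def jeffreys_parameters(kappa_0, kappa_1, tau_q):
    """Return (kappa, tau_T, eta) for the two-part kernel of Eq. (7.12).

    kappa = kappa_0 + kappa_1 is the steady-state conductivity,
    tau_T = tau_q * kappa_0 / kappa is the retardation time, and
    eta = kappa_0 / kappa is the diffusive fraction (1: Fourier, 0: Cattaneo).
    """
    if kappa_0 < 0.0 or kappa_1 < 0.0 or kappa_0 + kappa_1 <= 0.0:
        raise ValueError("conductivities must be nonnegative with a positive sum")
    kappa = kappa_0 + kappa_1
    eta = kappa_0 / kappa
    tau_t = tau_q * eta
    return kappa, tau_t, eta

def step_response_flux(kappa_0, kappa_1, tau_q, grad_t, time):
    """Heat flux from Eq. (7.13a) when a constant gradient is switched on at t = 0."""
    instantaneous = -kappa_0 * grad_t
    lagging = -kappa_1 * grad_t * (1.0 - math.exp(-time / tau_q))
    return instantaneous + lagging

kappa, tau_t, eta = jeffreys_parameters(kappa_0=30.0, kappa_1=70.0, tau_q=1.0e-11)
q_start = step_response_flux(30.0, 70.0, 1.0e-11, grad_t=-1.0e6, time=0.0)
q_late = step_response_flux(30.0, 70.0, 1.0e-11, grad_t=-1.0e6, time=1.0e-9)

print(f"eta = {eta:.2f}, tau_T = {tau_t:.2e} s")
print(f"flux at t = 0: {q_start:.2e} W/m^2, flux at t >> tau_q: {q_late:.2e} W/m^2")
```

The flux jumps immediately to the $\kappa_0$ contribution and then relaxes toward the full steady-state value $-\kappa\nabla T$ over a few $\tau_q$.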
The diffusive response here, as well as in Fourier’s law, does not correspond to an infinite speed of propagation. Rather, it is well justified by quantum statistics as explained previously. The dual-phase-lag model relaxes the requirement of $\tau_T \le \tau_q$; but in the meantime, it produces a negative thermal conductivity component, i.e., $\kappa_1 < 0$, according to Eq. (7.12). This drawback has long been overcome by Tzou, who proposed a new memory function in accordance with Eq. (7.16) as follows:15

$$\mathbf{q}''(\mathbf{r}, t) = -\frac{\kappa}{\tau_q}\int_{-\infty}^{t}\exp\left(-\frac{t - t'}{\tau_q}\right)\left[\nabla T(\mathbf{r}, t') + \tau_T\frac{\partial}{\partial t'}\nabla T(\mathbf{r}, t')\right]dt' \tag{7.17}$$
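As a consistency check, one can verify that the memory function in Eq. (7.17) reproduces Eq. (7.16) by differentiating with respect to time. The short derivation below is a sketch that assumes $\nabla T$ and its time derivative remain bounded as $t' \to -\infty$, so that the integral converges; the shorthand $\mathbf{g}$ is introduced here only for convenience.

```latex
\begin{aligned}
\mathbf{g}(\mathbf{r},t') &\equiv \nabla T(\mathbf{r},t')
   + \tau_T \frac{\partial}{\partial t'}\nabla T(\mathbf{r},t'), \\
\frac{\partial \mathbf{q}''}{\partial t}
  &= -\frac{\kappa}{\tau_q}\,\mathbf{g}(\mathbf{r},t)
     + \frac{\kappa}{\tau_q^{2}}\int_{-\infty}^{t}
       \exp\!\left(-\frac{t-t'}{\tau_q}\right)\mathbf{g}(\mathbf{r},t')\,dt'
   = -\frac{\kappa}{\tau_q}\,\mathbf{g}(\mathbf{r},t) - \frac{1}{\tau_q}\,\mathbf{q}'', \\
\mathbf{q}'' + \tau_q\frac{\partial \mathbf{q}''}{\partial t}
  &= -\kappa\,\mathbf{g}(\mathbf{r},t)
   = -\kappa\nabla T - \tau_T\kappa\frac{\partial}{\partial t}\nabla T .
\end{aligned}
```

The first term in the second line comes from the upper limit of the integral and the second from differentiating the exponential kernel; the last line is Eq. (7.16).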

Equation (7.17) suggests that the heat flux depends not only on the history of the temperature gradient but also on the history of the time derivative of $\nabla T$. When $\tau_T = 0$, Eq. (7.17) becomes the Cattaneo equation. When $\tau_T = \tau_q$, Eq. (7.17) reduces to Fourier’s law. However, $\tau_T > \tau_q$ is theoretically permitted because Eq. (7.17) does not presume that the thermal conductivity is composed of an effective conductivity and an elastic conductivity. The inclusion of $\tau_T > \tau_q$ makes Eq. (7.16) more general than the original Jeffreys-type equation, which is based on Eq. (7.13a). The extension to the region $\tau_T > \tau_q$ enables the lagging heat equation to describe the behavior of parallel heat conduction, which can occur in a number of engineering situations. Sometimes, a microscale phenomenon can be understood easily if a macroscale analog can be drawn. For this reason, let us consider the solid-fluid heat exchanger shown in Fig. 7.3.

FIGURE 7.3 Illustration of heat transfer in a solid-fluid heat exchanger, where long solid rods are immersed in a fluid inside a sealed pipe, which is insulated from the outside. (Side view; the control volume between $x$ and $x + dx$ is labeled with the heat flows $Q_f(x)$, $Q_f(x + dx)$, $Q_s(x)$, $Q_s(x + dx)$, and $Q_{conv}$.)

Assume that a fluid is stationary inside a sealed pipe, filled with long solid rods. The pipe is insulated from the outside. If the rods are sufficiently thin, we may use the average temperature in a cross section and assume that heat transfer takes place along the $x$ direction only. Let us denote the temperatures of the solid rods and the fluid by $T_s(x, t)$ and $T_f(x, t)$, respectively, and take their properties $\kappa_s$, $C_s = (\rho c_p)_s$, $\kappa_f$, and $C_f = (\rho c_p)_f$ to be constant. Note that $C_s$ and $C_f$ are the volumetric heat capacities. Given the rod diameter $d$, the number of rods $N$, and the inner diameter $D$ of the pipe, the total rod surface area per unit length is $P = N\pi d$, and the total cross-sectional areas of the rods and the fluid are $A_c = N\pi d^2/4$ and $A_f = (\pi/4)(D^2 - Nd^2)$, respectively. Assume the average convection coefficient is $h$. The energy balance equations can be obtained using the control volume analysis as follows:

$$C_s\frac{\partial T_s}{\partial t} = \kappa_s\frac{\partial^2 T_s}{\partial x^2} - G(T_s - T_f) \tag{7.18a}$$

and

$$C'_f\frac{\partial T_f}{\partial t} = G(T_s - T_f) \tag{7.18b}$$

where $G = hP/A_c$ and $C'_f = C_f A_f/A_c$. In writing Eq. (7.18b), we have assumed that $\kappa_f \ll \kappa_s$ and dropped the term $\kappa_f(\partial^2 T_f/\partial x^2)$. Equations (7.18a) and (7.18b) are coupled equations that can be solved for the prescribed initial and boundary conditions. These are completely macroscopic equations governed by Fourier’s law of heat conduction. Nevertheless, we can combine Eq. (7.18a) and Eq. (7.18b) to eliminate $T_f$ and, consequently, obtain the following differential equation for $T_s$:

$$\frac{\partial^2 T_s}{\partial x^2} + \tau_T\frac{\partial}{\partial t}\left(\frac{\partial^2 T_s}{\partial x^2}\right) = \frac{1}{\alpha}\frac{\partial T_s}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T_s}{\partial t^2} \tag{7.18c}$$


where $\alpha = \kappa_s/(C_s + C'_f)$, $\tau_T = C'_f/G$, and $\tau_q = C_s\tau_T/(C_s + C'_f) < \tau_T$. The same equation can also be obtained for the fluid temperature $T_f$. Here, $\tau_q$ does not have the meaning of relaxation time, and the solutions of Eq. (7.18) exhibit diffusion characteristics. Equation (7.18c) is completely physical but should not be viewed as a wave equation; rather, it describes a parallel or coupled heat diffusion process. The concept of dual phase lag can still be applied. It should be noted that, due to the initial temperature difference between the rod and the fluid, a local equilibrium is not established at any $x$ inside the pipe until after a sufficiently long time. Although no fundamental physics can be gained from this example, it can help us appreciate that the lagging heat equation may be useful for describing the behavior in inhomogeneous media. Minkowycz et al. studied the heat transfer in porous media by considering the departure from local thermal equilibrium and obtained higher-order differential equations similar to Eq. (7.18c).19

On the other hand, Kaminski made an experimental attempt to determine $\tau_q$ in the hyperbolic heat equation, by measuring the time interval between when the heat source was turned on and when a temperature signal was detected.22 The heat source and the thermometer used were long needles, placed in parallel and separated by a gap of 5 to 20 mm. What the experiment actually measured was the average thermal diffusion speed $v_{dif}$, if the cylindrical geometry and the initial conditions were properly taken into consideration in the analysis. The main problem with this frequently cited paper and similar studies in the 1990s was that most researchers did not realize that there is no physical justification for regarding the hyperbolic heat equation as superior to the parabolic heat equation; instead, they thought that the parabolic equation was only a special case of the more general hyperbolic equation.
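Returning to the solid-fluid heat exchanger of Fig. 7.3, the effective diffusivity and the two lag times that appear in Eq. (7.18c) follow directly from the geometry and properties. The sketch below evaluates $\alpha = \kappa_s/(C_s + C'_f)$, $\tau_T = C'_f/G$, and $\tau_q = C_s\tau_T/(C_s + C'_f)$, taking the exchange perimeter as the rod surface per unit length, $N\pi d$. The property values (roughly copper rods in water) and the function name are illustrative assumptions, not data from the text.

```python
import math

def exchanger_lag_parameters(k_s, c_s, c_f, h, rod_diameter, n_rods, pipe_diameter):
    """Return (alpha, tau_T, tau_q) for the rod-in-fluid exchanger of Eq. (7.18).

    k_s: rod thermal conductivity, W/(m K)
    c_s, c_f: volumetric heat capacities of rods and fluid, J/(m^3 K)
    h: average convection coefficient, W/(m^2 K)
    """
    perimeter = n_rods * math.pi * rod_diameter                 # rod surface per unit length
    area_rods = n_rods * math.pi * rod_diameter ** 2 / 4.0      # A_c
    area_fluid = math.pi / 4.0 * (pipe_diameter ** 2 - n_rods * rod_diameter ** 2)  # A_f
    if area_fluid <= 0.0:
        raise ValueError("rods do not fit inside the pipe cross section")
    g = h * perimeter / area_rods            # G = h P / A_c
    c_f_eff = c_f * area_fluid / area_rods   # C'_f = C_f A_f / A_c
    alpha = k_s / (c_s + c_f_eff)
    tau_t = c_f_eff / g
    tau_q = c_s * tau_t / (c_s + c_f_eff)
    return alpha, tau_t, tau_q

alpha, tau_t, tau_q = exchanger_lag_parameters(
    k_s=400.0,            # W/(m K)
    c_s=3.4e6,            # J/(m^3 K)
    c_f=4.18e6,           # J/(m^3 K)
    h=500.0,              # W/(m^2 K)
    rod_diameter=2.0e-3,  # m
    n_rods=100,
    pipe_diameter=0.05,   # m
)
print(f"alpha = {alpha:.3e} m^2/s, tau_T = {tau_t:.2f} s, tau_q = {tau_q:.2f} s")
```

For these macroscopic inputs both lag times are on the order of seconds, with $\tau_q < \tau_T$ as the text requires, which underlines that neither is a microscopic relaxation time.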
It appears that the Cattaneo equation and the associated hyperbolic heat equation are unlikely to be able to characterize any heat transfer problems successfully without additional terms. Many researchers have already expressed doubt about the applicability of the hyperbolic heat equation, though not so many have realized that an instantaneous response is a legitimate property, rather than a drawback, of the diffusion equation. Electron gas and phonon gas in solids are quantum mechanical particles, which do not have memory of any kind. Ideal molecular gases obey classical statistics and do not have memory either, unless the deposited energy is intense enough to cause ionization or reaction. Does the temperature wave exist? What is a temperature wave anyway? In the early 1940s, Russian theoretical physicist Lev Landau (1908–1968) used a two-fluid model to study the behavior of quasiparticles in superfluid helium II and predicted the existence of a second sound, propagating at a speed between $v_g/\sqrt{3}$ and $v_g$, depending on the temperature. Note that the group velocity is the same as the phase velocity for a linear dispersion. Above the λ-point, where superfluidity is lost, the second sound should also disappear. Landau was awarded the Nobel Prize in Physics in 1962 for his pioneering theories of condensed matter at low temperatures. He authored with his students a famous book series in mechanics and physics. Landau's prediction was validated experimentally (J. Phys. USSR, 8, 381, 1944) by Peshkov, who further postulated the existence of a second sound in crystals, when scattering by defects becomes minimized. It was not until the mid 1960s that the second sound associated with heat pulse propagation was observed in solid helium (below 1 K) and other crystals at low temperatures (below 20 K).
The second sound can occur only at very low temperatures, when the mean free path of phonons in the U processes, in which the total momentum is not conserved, is longer than the specimen size, while at the same time the scattering rate of the N processes, in which the total momentum is conserved, is high enough to dominate other scattering processes. It should be noted that while the N processes have a much shorter mean free path than the size of the specimen, scattering by N processes does not dissipate heat (see Sec. 6.5.3). Callaway simplified the BTE for phonon systems by a two-relaxation-time approximation, which should be applicable when $t > \tau_N$:

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} = \frac{f_0 - f}{\tau} + \frac{f_1 - f}{\tau_N} \tag{7.19}$$


where $\tau$ stands for the relaxation time for the U processes, $\tau_N$ is the relaxation time for the N processes, and $f_0$ and $f_1$ are the associated equilibrium distribution functions.21 Guyer and Krumhansl solved the linearized BTE and derived the following equation for the phonon effective temperature:

$$\nabla^2 T + \frac{9\tau_N}{5}\frac{\partial}{\partial t}\nabla^2 T = \frac{3}{\tau v_a^2}\frac{\partial T}{\partial t} + \frac{3}{v_a^2}\frac{\partial^2 T}{\partial t^2} \tag{7.20}$$

where $v_a$ is the average phonon speed.22 Assuming a linear dispersion, it can be evaluated using Eq. (5.10). Substituting $\alpha = \tau v_a^2/3$, $\tau_q = \tau$, and $\tau_T = 9\tau_N/5$, we see that Eq. (7.20) is identical to Eq. (7.14). The condition $t > \tau_N$ can be satisfied even at $t < \tau$ since $\tau_N \ll \tau$. The significance of Eq. (7.20) lies in that the temperature wave or the second sound is not universal, but rather, requires strict conditions to be met.22 When the condition $\tau_N \ll \tau$ is satisfied, we have $\tau_T \ll \tau_q$ and the energy transfer is dominated by wave propagation. At higher temperatures, the scattering rate for the U processes is usually very high, and the N processes contribute little to the heat conduction or thermal resistance, as discussed in Chap. 6. Therefore, the reason why temperature waves have not been observed in insulators at room temperature is not the small $\tau$, in the range from $10^{-10}$ to $10^{-13}$ s, but the lack of mechanisms required for a second sound to occur. No experiments have ever shown the second sound in metals that the hyperbolic heat equation would suggest. Recently, Shiomi and Maruyama performed molecular dynamics simulations of the heat conduction through (5,5) single-walled carbon nanotubes, 25 nm in length, for several femtoseconds.23 They found that the wavelike behavior could be fitted by the lagging heat equation, but could not be described by the hyperbolic heat equation due to local diffusion. The ballistic nature of heat propagation in nanotubes has already been explained in Chap. 5. They suspected that optical phonons might play a major role in the non-Fourier conduction process.23 Tsai and MacDonald studied the strong anharmonic effects at high temperature and pressure using molecular dynamics.24 Their work predicted a second sound response. The coupling of elastic and thermal effects was thought to be important.
Studies on thermomechanical effects such as thermal expansion, thermoelasticity, and shock waves can be found in Tzou15 and Wang and Xu,25 and will not be discussed further. Tang and Araki clearly delineated four regimes in the lagging heat equation, according to the ratio $\tau_T/\tau_q$.17 (1) When $\tau_T/\tau_q = 0$, it is a damped wave, i.e., hyperbolic heat conduction. (2) When $0 < \tau_T/\tau_q < 1$, it is wavelike diffusion, for which wave features can be clearly seen if $\tau_T/\tau_q \ll 1$. (3) When $\tau_T/\tau_q = 1$, it is pure diffusion, i.e., Fourier's conduction. (4) When $\tau_T/\tau_q > 1$, it is called over-diffusion, which makes the dimensionless temperature decay faster than pure diffusion would. In the next section, we will discuss a microscopic theory on short-pulse laser heating of metals, which falls in the regime of over-diffusion, or parallel conduction.
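The four regimes can be summarized in a few lines of code. The following is only a minimal sketch: the function names `lagging_regime` and `guyer_krumhansl_lags` are hypothetical, and the relaxation times in the usage lines are illustrative values chosen so that $\tau_N \ll \tau$, not data from the text.

```python
def lagging_regime(tau_T, tau_q, rel_tol=1e-9):
    """Name the regime of the lagging heat equation from the ratio tau_T / tau_q."""
    if tau_q <= 0.0 or tau_T < 0.0:
        raise ValueError("expected tau_q > 0 and tau_T >= 0")
    ratio = tau_T / tau_q
    if ratio == 0.0:
        return "damped wave (hyperbolic heat conduction)"
    if abs(ratio - 1.0) <= rel_tol:
        return "pure diffusion (Fourier conduction)"
    if ratio < 1.0:
        return "wavelike diffusion"
    return "over-diffusion"


def guyer_krumhansl_lags(tau, tau_N):
    """Return (tau_T, tau_q) from the substitution that maps Eq. (7.20) onto Eq. (7.14)."""
    return 9.0 * tau_N / 5.0, tau


# Illustrative (hypothetical) relaxation times in seconds, with tau_N << tau.
tau_T, tau_q = guyer_krumhansl_lags(tau=1e-9, tau_N=1e-12)
print(lagging_regime(tau_T, tau_q))
```

With these assumed inputs the ratio is much smaller than unity, so the sketch reports wavelike diffusion, the regime in which a second sound may be seen.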

7.1.3 Two-Temperature Model

With a short laser pulse, 5 fs to 500 ps, free electrons absorb radiation energy, and the absorbed energy excites the electrons to higher energy levels. The "hot electrons" move around randomly and dissipate heat mainly through electron-phonon interactions. Following the work of Kaganov et al. (Sov. Phys. JETP, 4, 173, 1957), Anisimov proposed a two-temperature model, which is a pair of coupled nonlinear equations governing the effective temperatures of electrons and phonons.26 This model was experimentally confirmed later by Fujimoto et al. (Phys. Rev. Lett., 53, 1837, 1984) and Brorson et al. (Phys. Rev. Lett., 59, 1962, 1987). The two-temperature model was introduced to the heat transfer community by Qiu and Tien, who also analyzed the size effect due to boundary scattering and performed experiments with thin metallic films.27 In the two-temperature model, it was assumed that the electron and phonon systems are each at their own local equilibrium, but not in mutual equilibrium. The electron temperature could be much higher than the lattice (or phonon) temperature due to absorption of pulse heating. Therefore,

$$C_e\frac{\partial T_e}{\partial t} = \nabla\cdot(k\nabla T_e) - G(T_e - T_s) + \dot{q}_a \tag{7.21a}$$

$$C_s\frac{\partial T_s}{\partial t} = G(T_e - T_s) \tag{7.21b}$$

Here, the subscripts e and s are for the electron and phonon systems, respectively, C is the volumetric heat capacity, G is the electron-phonon coupling constant, and $\dot{q}_a$ is the source term that represents the absorbed energy rate per unit volume during the laser pulse and drops to zero after the pulse. Heat conduction by phonons is neglected, and thus, the subscript e is dropped in the thermal conductivity k. Note that $\mathbf{q}'' = -k\nabla T_e$, according to Fourier's law. We have already given a macroscopic example of parallel heat transfer, as shown in Fig. 7.3, which should ease the understanding of the phenomenological relations given in Eq. (7.21). Equation (7.21) originates from microscopic interactions between photons, electrons, and phonons. In order to examine the parameters in Eq. (7.21) and their dependence on $T_e$ and $T_s$, let us assume that the lattice temperature is near or above the Debye temperature, for simplicity. In such a case, electron-electron scattering and electron-defect scattering are insignificant compared with electron-phonon scattering. It is expected that the electron relaxation time is inversely proportional to the lattice temperature, i.e., $\tau \approx \tau_{e\text{-}ph} \propto T_s^{-1}$. The meaning of the relaxation time is that the electron system can be assumed to be at internal local equilibrium when $t > \tau$, which is the condition for Eq. (7.21) to be applicable. Boundary scattering may play a role for very thin films or in polycrystalline materials. An effective mean free path can be introduced to modify the scattering rate.27–29 The volumetric heat capacity for the lattice or phonons, $C_s = \rho c_p$, is a weak function of the lattice temperature; the volumetric heat capacity of electrons, from Eq. (5.25), becomes

$$C_e = \frac{\pi^2 n_e k_B^2}{2\mu_F}T_e = \gamma_s T_e \tag{7.22}$$
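To see how Eq. (7.21) behaves with the temperature-dependent electron heat capacity of Eq. (7.22), the following sketch integrates a spatially uniform (lumped) version of the model, in which the conduction term of Eq. (7.21a) is dropped. This is a simplification for illustration, not the full model. The Gaussian pulse shape, the absorbed energy density `u_abs`, and the values of `GAMMA_S` and `C_S` are assumptions of roughly the size expected for gold; only G is the value quoted in the text for gold. The electron energy $\gamma_s T_e^2/2$ is used as the state variable so that the explicit time stepping conserves energy.

```python
import math

GAMMA_S = 70.0   # J/(m^3 K^2), assumed electron heat capacity coefficient (C_e = GAMMA_S * T_e)
C_S = 2.5e6      # J/(m^3 K), assumed lattice heat capacity
G = 2.9e16       # W/(m^3 K), coupling constant quoted in the text for gold


def simulate_lumped_ttm(u_abs=1.0e8, t_pulse=100e-15, t_end=50e-12, dt=1e-15, T0=300.0):
    """Explicit Euler integration of Eq. (7.21) without the conduction term.

    u_abs   : absorbed energy per unit volume, J/m^3 (assumed)
    t_pulse : full width at half maximum of the Gaussian pulse, s (assumed shape)
    Returns a list of (time, T_e, T_s) tuples.
    """
    sigma = t_pulse / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # FWHM to standard deviation
    t_center = 3.0 * t_pulse
    norm = u_abs / (sigma * math.sqrt(2.0 * math.pi))          # so that the pulse integrates to u_abs
    E_e = 0.5 * GAMMA_S * T0 ** 2                              # electron energy density for C_e = gamma*T_e
    T_s = T0
    history = [(0.0, T0, T0)]
    for i in range(int(round(t_end / dt))):
        t = i * dt
        T_e = math.sqrt(2.0 * E_e / GAMMA_S)
        q_a = norm * math.exp(-0.5 * ((t - t_center) / sigma) ** 2)
        exchange = G * (T_e - T_s)                             # electron-to-lattice energy transfer rate
        E_e += dt * (q_a - exchange)
        T_s += dt * exchange / C_S
        history.append((t + dt, math.sqrt(2.0 * E_e / GAMMA_S), T_s))
    return history


history = simulate_lumped_ttm()
peak_T_e = max(T_e for _, T_e, _ in history)
_, T_e_end, T_s_end = history[-1]
print(f"peak T_e = {peak_T_e:.0f} K, final T_e = {T_e_end:.1f} K, final T_s = {T_s_end:.1f} K")
```

Under these assumed inputs the electron temperature peaks well above 1000 K while the lattice ends up only a few tens of kelvins above its initial value, which is the qualitative behavior the two-temperature model is meant to capture.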

Recall that $C_e$ is relatively small compared with $C_s$, even at several thousand kelvins. From the simple kinetic theory, the thermal conductivity is

$$k = \frac{\pi^2 n_e k_B^2}{3m_e}\,\tau T_e \approx \frac{T_e}{T_s}\,k_{eq} \tag{7.23}$$

where $k_{eq}$ is the thermal conductivity when $T_e = T_s$, which can be set as the room temperature value. The term $T_e$ in Eq. (7.23) comes from the heat capacity. The size effect can be included using an effective relaxation time. Theoretically, the coupling constant can be estimated by

$$G = \frac{\pi^2 m_e n_e v_a^2}{6\tau T_s} \quad\text{or}\quad G = \frac{\pi^4 (n_e v_a k_B)^2}{18 k_{eq}} \tag{7.24}$$

which is independent of temperature when boundary scattering is not important, but is proportional to the square of the speed of sound in the metal. With the speed of sound in the low-frequency limit, the dispersion is linear; thus, we do not have to worry about the difference between the phase velocity and the group velocity. From Eq. (5.10), we have

$$v_a = \frac{k_B\Theta_D}{h}\left(\frac{4\pi}{3n_a}\right)^{1/3} \tag{7.25}$$


When boundary scattering is included, G is expected to increase from the bulk value and depend on the lattice temperature. Using the Debye temperature and setting $n_a = n_e$, we have

$$G = \frac{\pi^2}{12\sqrt[3]{4}}\,\frac{n_e k_B^2\Theta_D^2}{\tau T_s\mu_F} \approx 0.518\,\frac{n_e k_B^2\Theta_D^2}{\tau T_s\mu_F} \tag{7.26}$$
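As a rough numerical illustration of Eqs. (7.24) and (7.25), the sketch below estimates the speed of sound and the coupling constant for gold. The input values (electron number density, Debye temperature, and room-temperature thermal conductivity) are assumed handbook-type numbers that are not given in this section, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s


def debye_sound_speed(theta_D, n_a):
    """Average speed of sound from Eq. (7.25)."""
    return (K_B * theta_D / H) * (4.0 * math.pi / (3.0 * n_a)) ** (1.0 / 3.0)


def coupling_constant(n_e, v_a, k_eq):
    """Electron-phonon coupling constant from the second form of Eq. (7.24)."""
    return math.pi ** 4 * (n_e * v_a * K_B) ** 2 / (18.0 * k_eq)


# Assumed inputs for gold (not from this section).
n_e = 5.9e28       # m^-3, taken equal to the atomic number density n_a
theta_D = 165.0    # K
k_eq = 315.0       # W/(m K)

v_a = debye_sound_speed(theta_D, n_e)
G = coupling_constant(n_e, v_a, k_eq)
print(f"v_a = {v_a:.0f} m/s, G = {G:.2e} W/(m^3 K)")
```

With these assumptions the estimate lands on the order of $10^{16}$ W/(m³·K); the exact figure depends on the property values chosen.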

Typical values of G are on the order of $10^{16}$ W/(m³·K), e.g., $G \approx 2.9\times10^{16}$ W/(m³·K) for gold. The behavior of the electron and phonon temperatures near the surface is shown in Fig. 7.4, for a short pulse. The electron temperature rises quickly during the pulse and begins to decrease afterward; in the meantime, the lattice temperature gradually increases until the electron and lattice systems reach thermal equilibrium. Both temperatures will go down as heat is carried away from the surface. Note that the electron temperature can rise very high due to its small heat capacity, whereas the lattice may be just slightly above room temperature. If the temperatures of the electrons and the lattice were assumed to be the same, Eq. (7.21) would reduce to the simple Fourier heat conduction equation, which in turn predicts a much lower temperature rise, because the heat capacity of the lattice is much higher than that of the electrons. Given such a short timescale and the local nonequilibrium between electrons and phonons, no contact thermometer could possibly measure the effective electron temperature.

FIGURE 7.4 Illustration (not to scale) of ultrafast thermoreflectance experiments and the associated electron and phonon temperatures near the surface, during a short pulse. [Inset: pump beam and probe beam on the specimen, with the thermoreflectance signal; curves: $T_e$, $T_s$, and $\dot{q}_a$ versus $t$, with pulse width $t_p$.]

Experiments are usually performed by the femtosecond or picosecond thermoreflectance technique, also known as the pump-and-probe method, shown in the inset of Fig. 7.4. The reflectance of the surface depends on the electron temperature $T_e$. The experimental setup is rather involved and cannot measure the temperature distribution inside the material. The procedure is to send a pump pulse train that is synchronized with a probe pulse train at a fixed delay time. The electron temperature change near the surface is deduced by comparison of the reflectance measurements at different delay times. Electron-phonon coupling, boundary scattering, and thermal boundary resistance can all affect the thermoreflectance signal. By comparing the measurements with the model described in Eq. (7.21), along with the dependence of the reflectance on the electron temperature, the microscopic characteristics can be analyzed. Ultrafast thermoreflectance techniques have become an important thermal metrology tool for the study of electron-phonon interactions, TBR, and thermophysical properties.30,31 Thermionic emission can also occur from the surface, especially when the electrons are excited to higher energy states.32 Similarly to what has been done for Eq. (7.18), Eq. (7.21a) and Eq. (7.21b) can be combined to formulate partial differential equations for either the electron or phonon temperature. Neglecting the temperature dependence of the parameters, one obtains the following differential equations for the electron temperature and the phonon temperature, respectively,

$$\nabla^2 T_e + \tau_T\frac{\partial}{\partial t}\nabla^2 T_e + \frac{\dot{q}_a}{k} + \frac{\tau_T}{k}\frac{\partial\dot{q}_a}{\partial t} = \frac{1}{\alpha}\frac{\partial T_e}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T_e}{\partial t^2} \tag{7.27a}$$

$$\nabla^2 T_s + \tau_T\frac{\partial}{\partial t}\nabla^2 T_s + \frac{\dot{q}_a}{k} = \frac{1}{\alpha}\frac{\partial T_s}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T_s}{\partial t^2} \tag{7.27b}$$

where $\alpha = k/(C_e + C_s)$, $\tau_T = C_s/G$, and $\tau_q = \tau_T C_e/(C_e + C_s) \approx C_e/G \ll \tau_T$. These equations are identical to the lagging heat equations and can be solved with appropriate boundary conditions. The results again belong to the regime of over-diffusion, or parallel conduction, without any wavelike features. Cooling caused by thermionic emission is usually neglected, and the surface under illumination can be assumed adiabatic. A 1-D approximation further simplifies the problem. The solution follows the general trends depicted in Fig. 7.4. The situation will be completely changed if a phase change occurs or if the system is driven to exceed the linear harmonic behavior.15,25 The term $\tau_q$ is clearly not the same as the relaxation time $\tau$ due to collision. The resulting solution is more diffusive than wavelike. In the literature, $\tau_q$ is commonly referred to as the thermalization time. The physical meaning of $\tau_q$ is a thermal time constant for the electron system to reach equilibrium with the phonon system. For noble metals at room temperature, the relaxation time $\tau$ is on the order of 30 to 40 fs, the thermalization time $\tau_q$ is 0.5 to 0.8 ps, and the retardation time $\tau_T$ is 60 to 90 ps. In practice, we need to consider the temperature dependence of the parameters in Eq. (7.21), as mentioned earlier. Some numerical solutions, considering temperature dependence, and comparisons with experiments can be found in Smith et al.33 and Zhou and Chiu.34 Given that the two-temperature model cannot be applied to $t < \tau$, due to the limitation of Fourier's law, one may prefer to use a pulse width $t_p$ between 100 and 200 fs and measure the response during several picoseconds until the thermalization process is complete, i.e., the electron and phonon temperatures become the same. This first-stage measurement allows the determination of the coupling constant G. In the case of a thin film, the TBR sets a barrier for heat conduction between the film and the substrate.
The time constant of the film can range from several tens to hundreds of picoseconds. Therefore, the TBR between the film and the substrate can be determined by continuing the observation of thermoreflectance signals for 1 to 2 ns after each pulse. Fitting the curves in the second-stage measurement allows an estimate of the TBR. Of course, one could use a longer pulse width $t_p$ to determine the TBR. Most advanced femtosecond research laboratories are equipped with Ti:sapphire lasers whose pulse widths range from 50 to 500 fs. Femtosecond lasers with a pulse width of 25 fs have also been used in some studies; see for example Li et al. (J. Opt. Soc. Am. B, 15, 2404, 1998; Phys. Rev. Lett., 82, 2394, 1999). For $t_p$ below 50 fs, Eq. (7.21a) is not applicable during the heating, at least for noble metals. The relaxation time for Cr is about 3 fs, and Eq. (7.21) can be safely applied even with $t_p = 10$ fs. However, the processes below 20 fs may largely involve electron-electron inelastic scattering, thermionic emission, ionization, phase transformation, chemical reaction, and so forth. Other difficult issues associated with the reduced pulse width include widened frequency spectrum, increased pulse intensity, decreased pulse energy, and so forth. A simple hyperbolic heat flux formulation cannot properly address these issues at $t_p < \tau$. One must investigate the physical and chemical processes occurring at this timescale in order to develop a physically plausible model, with or without the concept of effective temperatures. Femtosecond laser interactions with dielectric materials have also been extensively studied (see Jiang and Tsai35 and references therein).

Let us reiterate the major points presented in this section:

(a) Fourier's law, which is limited to local equilibrium conditions, does not predict an infinite speed of heat diffusion, nor does it violate the principle of causality. An instantaneous response at a finite distance is permitted by quantum statistics, although the probability of such a response sharply approaches zero as the distance increases. An instantaneous temperature change or heat flux at a precise location is not physically possible. Only under the continuum assumption can we use the concept of a sudden change of temperature at the boundary.

(b) Heat diffusion is usually a very slow process, compared with the speed of sound. The temperature wave, or the second sound, has been observed only in helium and some very pure dielectric crystals, at low temperatures, where the U processes are ballistic and the N processes have a very high scattering rate. However, the simple hyperbolic heat equation has been proved neither theoretically nor experimentally. There is no need to collect previous or future experimental evidence to test the hyperbolic heat equation, which was ill-formulated in the first place.

(c) All kinds of non-Fourier equations are based on some sort of effective temperature, which is not measurable using a contact thermometer. The principle of contact thermometry is the zeroth law of thermodynamics, which originates from the theory of thermal equilibrium. The concept of coldness or hotness should be abandoned in reference to nonequilibrium energy transport processes. Noncontact thermometry, on the other hand, relies on certain physical responses to deduce the equilibrium temperature or the effective temperature of the system being measured.

(d) The memory hypothesis and the lagging argument are phenomenological models that may be useful in the study of certain nonequilibrium or parallel conduction processes, but are not universally applicable. These and similar equations must be derived and applied on a case-by-case basis. It is important to understand the microscopic processes occurring at the appropriate length scales and timescales in order to develop physically reliable models.
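The orders of magnitude quoted in Sec. 7.1.3 for the thermalization time and the retardation time of noble metals can be cross-checked from the definitions given after Eq. (7.27). The sketch below does this for gold; the heat capacities and the thermal conductivity are assumed values (not from the text), while G is the value quoted in the text, and the function name `lag_times` is hypothetical.

```python
def lag_times(C_e, C_s, G):
    """Return (tau_q, tau_T) from the definitions following Eq. (7.27)."""
    tau_T = C_s / G
    tau_q = tau_T * C_e / (C_e + C_s)
    return tau_q, tau_T


# Assumed room-temperature properties for gold.
C_e = 70.0 * 300.0   # J/(m^3 K), from C_e = gamma_s * T_e with an assumed gamma_s = 70 J/(m^3 K^2)
C_s = 2.5e6          # J/(m^3 K), assumed
k = 315.0            # W/(m K), assumed
G = 2.9e16           # W/(m^3 K), value quoted in the text for gold

tau_q, tau_T = lag_times(C_e, C_s, G)
alpha = k / (C_e + C_s)
print(f"tau_q = {tau_q * 1e12:.2f} ps, tau_T = {tau_T * 1e12:.0f} ps, alpha = {alpha:.2e} m^2/s")
```

Both time constants computed this way fall inside the ranges stated in the text (0.5 to 0.8 ps and 60 to 90 ps), and their ratio places the two-temperature model in the over-diffusion regime.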

7.2 HEAT CONDUCTION ACROSS LAYERED STRUCTURES

In Sec. 5.5.2, we have given a detailed discussion on the heat conduction along a thin film using the BTE, under the local equilibrium assumption. An effective thermal conductivity can be used after taking proper account of boundary scattering. The heat conduction problem can thus be well described by Fourier's law using the effective thermal conductivity. As mentioned earlier, for heat transfer across a film or a superlattice, the condition of local equilibrium breaks down in the acoustically thin limit. The local distribution function cannot be approximated by an equilibrium distribution function at any temperature. Conventional Fourier's law breaks down because it relies on the definition of an equilibrium temperature and the existence of local equilibrium. It is natural to ask the following two questions: (1) Is it possible for us to define an effective temperature? (2) Can Fourier's law still be useful in the nonequilibrium regime, according to the effective temperature? This section presents the equation of phonon radiative transfer (EPRT) and the solution of the EPRT for thin films under the relaxation time approximation. A resistance network representation is presented to illustrate how Fourier's law of heat conduction may be applied inside the medium, at least approximately, with temperature-jump boundary conditions. Because of the importance of understanding the boundary conditions, this section also discusses models of thermal boundary resistance (TBR) in layered structures.


7.2.1 Equation of Phonon Radiative Transfer (EPRT)

The phonon BTE under the relaxation time approximation, in a region with heat generation, may be written as

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} = \frac{f_0 - f}{\tau(\omega,T)} + S_0 \tag{7.28}$$

where the second term $S_0$ on the right-hand side is a source term to model the generation of phonons due to heat dissipation, such as electron-phonon scattering. Phonon-phonon scattering is already included in the first term on the right-hand side. The scattering rate may also include phonon-defect scattering. Many studies have treated phonon transport in analogy to thermal radiative transfer.12, 36–43 In the following, a simplified case is used to illustrate how to model heat transfer across a thin film as well as multilayer structures. Let us consider a film of thickness L between two boundaries without any internal source. The phonon BTE becomes

$$\frac{\partial f}{\partial t} + v_x\frac{\partial f}{\partial x} = \frac{f_0 - f}{\tau} \tag{7.29}$$

Realizing that the nonequilibrium distribution function may be anisotropic, let us define

$$I_\omega(x,\Omega,t) = \frac{1}{4\pi}\sum_P v_g\hbar\omega f D(\omega) \tag{7.30}$$

where P is the number of phonon modes or polarizations. Equation (7.30) gives the phonon intensity, which is the energy transfer rate in the direction $\Omega$ from a unit area, per unit frequency and per unit solid angle. The geometry of the problem and an illustration of the intensity are given in Fig. 7.5.

FIGURE 7.5 Schematic of phonon radiative transfer inside a dielectric medium between two walls maintained at temperatures $T_1$ and $T_2$. These walls are like heat reservoirs, but their surfaces are not necessarily blackbodies. [Labels: $T_1$, $T_2$, $I^+_\omega$, $I^-_\omega$, $I^*_\omega$, $q''_\omega$, polar angle $\theta$, coordinate $x$.]

In this section, we use $v_g$ for the group velocity and $v_p$ for the phase velocity. Note that $v_x = v_g\cos\theta$, where $\theta$ is the polar angle. Substituting Eq. (7.30) into Eq. (7.29), we obtain

$$\frac{1}{v_g}\frac{\partial I_\omega}{\partial t} + \mu\frac{\partial I_\omega}{\partial x} = \frac{I^*_\omega - I_\omega}{v_g\tau} \tag{7.31}$$


where $\mu = \cos\theta$ and $I^*_\omega(\omega,T)$ is the intensity for the equilibrium distribution, which is independent of the direction. Equation (7.31) is called the equation of phonon radiative transfer (EPRT).12,40 Comparing the EPRT with the ERT given in Eq. (2.52), we see that the scattering terms are neglected in the EPRT, and the emission and the absorption are replaced by the phonon collision terms. The phonon mean free path $\Lambda = v_g\tau$ is also called the phonon penetration depth (see Example 4-2). The inverse of the penetration depth, $1/\Lambda$, corresponds to the absorption coefficient in the ERT. Conversion to the EPRT allows well-established theories and numerical techniques, developed in radiative transfer, to be applied to solve Eq. (7.31) and to interpret the physical significance of the solutions.44,45 If $\tau$ does not depend on frequency, we are dealing with a gray medium. If the phonon Knudsen number $Kn = \Lambda/L \ll 1$, then most phonons will collide with phonons or defects inside the medium. This regime is called the acoustically thick limit, in analogy to the optically thick limit for photons. This is also known as the macroscale regime or the local equilibrium situation. Except at a very short timescale, when a sudden local disturbance occurs, we expect that Fourier's law is applicable and the heat conduction is by diffusion. On the other hand, if $Kn = \Lambda/L \gg 1$, phonons originating from one boundary will most likely reach the other boundary without colliding with other phonons or defects inside the medium. This is the ballistic regime, corresponding to free molecule flow for molecular gases. This regime is called the acoustically thin limit, where the phonon distribution inside the medium cannot be characterized by an equilibrium distribution function if the walls are at different temperatures, even in the steady state.
Because we are dealing with the radiative transfer for phonons as we do for photons, from now on, we will refer to $Kn = \Lambda/L \gg 1$ as the radiative thin limit and $Kn = \Lambda/L \ll 1$ as the radiative thick limit. Because the BTE is more fundamental than Fourier's law, it works for either limit as well as between the two limits. It would be very useful if a macroscopic model could also be developed to bridge these two limits. Rather than referring readers to more specialized journal papers, in the following, we present some basic formulations that are logically connected with materials presented in earlier chapters. Note that $I^*_\omega$ is the equilibrium distribution function, which is independent of the direction. Using Bose-Einstein statistics, we have

$$I^*_\omega(\omega,T) = \sum_P \frac{v_g\hbar\omega}{e^{\hbar\omega/k_BT} - 1}\,\frac{k^2}{(2\pi)^3}\frac{dk}{d\omega} = \sum_P \frac{\hbar\omega^3}{8\pi^3 v_p^2\left(e^{\hbar\omega/k_BT} - 1\right)} \tag{7.32}$$

This equilibrium distribution is also the distribution function for blackbody radiation with $v_p$ replaced by the speed of light. Integrating Eq. (7.32) over all frequencies gives the total intensity for all three phonon modes as follows:

$$I^*(T) = \int_0^\infty I^*_\omega(\omega,T)\,d\omega = \frac{3k_B^4T^4}{8\pi^3\hbar^3v_a^2}\int_0^\infty\frac{x^3\,dx}{e^x - 1} = \frac{\sigma'_{SB}T^4}{\pi} \tag{7.33}$$

where $\sigma'_{SB} = \pi^2k_B^4/(40\hbar^3v_a^2)$ is the phonon Stefan-Boltzmann constant, and $v_a$ is the average phase velocity of the two transverse and one longitudinal phonon modes, defined according to Eq. (5.7). Let us consider a solid at temperatures higher than the Debye temperature. The integration can be carried out to an upper limit $\omega_m$ with $x_m = \hbar\omega_m/k_BT \ll 1$. From the discussion following Eq. (5.13), one can easily show that

$$I^*(T) = \int_0^{\omega_m} I^*_\omega(\omega,T)\,d\omega = \frac{\omega_m^3k_BT}{8\pi^3v_p^2} \tag{7.34}$$

This integration is a good approximation, even at temperatures slightly lower than the Debye temperature. When phonons are at equilibrium, the energy flux is $\pi I^*$, which is obtained by integrating $I^*\cos\theta\,d\Omega$ over the hemisphere. According to Eq. (4.12), the energy density can be expressed as

$$u(T) = \frac{4\pi}{v_g}I^*(T) \tag{7.35}$$
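Equations (7.33) and (7.35) can be cross-checked numerically in the low-temperature limit. The sketch below assumes a linear dispersion with a single speed `v_a` for all three modes (so that $v_p = v_g = v_a$); the value of `v_a` used in the usage lines is hypothetical, and all function names are illustrative.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s


def sigma_sb_phonon(v_a):
    """Phonon Stefan-Boltzmann constant, pi^2 k_B^4 / (40 hbar^3 v_a^2)."""
    return math.pi ** 2 * K_B ** 4 / (40.0 * HBAR ** 3 * v_a ** 2)


def bose_integral(x_max=60.0, n=6000):
    """Simpson's rule for the integral of x^3/(e^x - 1) from 0 to x_max (n even)."""
    def g(x):
        return 0.0 if x == 0.0 else x ** 3 / math.expm1(x)
    h = x_max / n
    s = g(0.0) + g(x_max)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * g(i * h)
    return s * h / 3.0


def total_intensity(T, v_a):
    """Middle expression of Eq. (7.33), evaluated by quadrature."""
    prefactor = 3.0 * K_B ** 4 * T ** 4 / (8.0 * math.pi ** 3 * HBAR ** 3 * v_a ** 2)
    return prefactor * bose_integral()


def energy_density(T, v_a):
    """Eq. (7.35) with v_g = v_a."""
    return 4.0 * math.pi * total_intensity(T, v_a) / v_a


v_a = 5000.0   # m/s, hypothetical average speed of sound
T = 10.0       # K, assumed to be far below the Debye temperature
print(total_intensity(T, v_a), sigma_sb_phonon(v_a) * T ** 4 / math.pi)
```

The two printed numbers should agree closely, since the integral equals $\pi^4/15$. Differentiating `energy_density` with respect to T reproduces the $T^3$ dependence of the heat capacity, $C = 16\sigma'_{SB}T^3/v_a$ under the same assumptions.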

Note that the volumetric heat capacity $C = du/dT$. We therefore obtain the low-temperature relation of the specific heat, i.e., the $T^3$ law, and the high-temperature relation of the specific heat, i.e., the Dulong-Petit law, as already derived in Sec. 5.1.2. It is important to pay attention to the meaning of C in the kinetic expression of thermal conductivity:

$$k = \frac{1}{3}Cv_g^2\tau \tag{7.36}$$

At very low temperatures, when $T \ll \Theta_D$, C is the volumetric heat capacity of all phonon modes combined because only low-frequency modes or acoustic branches contribute to the specific heat. However, at temperatures close to the Debye temperature, phonons in the optical branches contribute little to the thermal conductivity, as already discussed in Chap. 6. The relative contributions of LA and TA branches are also temperature dependent. The Debye temperature for most materials, except diamond, is not much higher than room temperature (see Table 5.2). Therefore, we must treat C as a fraction of the volumetric specific heat in dealing with Si, GaAs, Ge, ZnS, or GaN, near room temperature. Also, we must use the appropriate upper limit in the integral in calculating the total energy transfer when applying the EPRT. The heat flux per unit frequency interval can thus be expressed as

$$q''_\omega = \int_{4\pi} I_\omega\cos\theta\,d\Omega = 2\pi\int_{-1}^{1} I_\omega\,\mu\,d\mu \tag{7.37}$$

Energy balance at any given location requires that the incoming flux be the same as the outgoing flux, for both steady and transient states. This is the criterion for radiative equilibrium, which can be expressed as follows:

$$4\pi\int_0^{\omega_m}\frac{1}{\Lambda_\omega}I^*_\omega\,d\omega = 2\pi\int_0^{\omega_m}\int_{-1}^{1}\frac{1}{\Lambda_\omega}I_\omega\,d\mu\,d\omega \tag{7.38}$$

where $\Lambda_\omega$ is the mean free path at $\omega$, $4\pi$ on the left-hand side came from the integration over all solid angles in a sphere, and $2\pi$ on the right-hand side came from integration over the azimuth angles. Equation (7.38) gives a definition of an effective phonon temperature $T^*$ based on $I^*_\omega(T^*,\omega)$. An equivalent expression can be obtained based on the energy density, viz.,

$$u(T^*) = \sum_P\sum_K \hbar\omega f(\omega,\Omega) \tag{7.39}$$

It follows that the local equilibrium condition can be rewritten as

$$I^*_\omega = \frac{1}{2}\int_{-1}^{1} I_\omega\,d\mu \tag{7.40}$$

Local equilibrium is a sufficient, but not necessary, condition for the radiative equilibrium given in Eq. (7.38), regardless of whether the medium is gray or not. The physical significance of Eq. (7.40) is that the angular average of the intensity, at a given location and time, can be described by an equilibrium intensity at the effective temperature. Obviously, Eq. (7.40) is not applicable in the radiative thin limit, unless the temperature difference between the two boundaries is negligibly small.


Example 7-3. Consider a dielectric medium of thickness $L = 0.01\Lambda$, where $\Lambda$ is independent of wavelength. The boundary or wall temperatures are $T_1 = 100$ K and $T_2 = 20$ K. Both temperatures are much lower than the Debye temperature. Assume that reflection at the boundaries is negligible, i.e., the walls can be modeled as blackbodies. Find the steady-state temperature of the medium and the heat flux through the medium.

Solution. Because $Kn = \Lambda/L \gg 1$, the medium is said to be in the radiative thin limit, in which phonons travel from one wall to another ballistically with little chance of being scattered by other phonons or defects inside the medium. The forward intensity can be expressed as $I^+_\omega = I^*_\omega(T_1,\omega)$ for $\mu > 0$, and the backward intensity as $I^-_\omega = I^*_\omega(T_2,\omega)$ for $\mu < 0$. From Eq. (7.37), we have

$$q''_x = \int_0^\infty q''_\omega\,d\omega = 2\pi\int_0^\infty\int_0^1\left(I^+_\omega - I^-_\omega\right)\mu\,d\mu\,d\omega = \sigma'_{SB}\left(T_1^4 - T_2^4\right) \tag{7.41}$$

For heat conduction, the above equation is called the Casimir limit (Physica, 5, 595, 1938). To numerically evaluate this equation, we need data for $v_a$. From Eq. (7.38), we have

$$\frac{\sigma'_{SB}T^4}{\pi} = \frac{1}{2}\int_0^{\omega_m}\left(I^+_\omega + I^-_\omega\right)d\omega = \frac{1}{2\pi}\left(\sigma'_{SB}T_1^4 + \sigma'_{SB}T_2^4\right) \tag{7.42}$$

where T is the effective temperature inside the medium, $0 < x < L$. Since $T(0) = T_1$ and $T(L) = T_2$ are the boundary conditions, there is a temperature jump at each boundary. We notice immediately that Eq. (7.40) cannot be satisfied with the temperature defined previously. If we force

$$I^*_\omega = \frac{1}{2}\int_{-1}^{1} I_\omega\,d\mu = \frac{1}{2}\left(I^+_\omega + I^-_\omega\right) \tag{7.43}$$

we would end up with different temperatures at each frequency. In the next chapter (Sec. 8.2.3), we will further discuss the concept of monochromatic temperature. If the walls are not black but diffuse-gray with emissivities $\varepsilon_1$ and $\varepsilon_2$, similar to Eq. (2.51), the heat flux becomes

$$q''_x = \frac{\sigma'_{SB}T_1^4 - \sigma'_{SB}T_2^4}{1/\varepsilon_1 + 1/\varepsilon_2 - 1} \tag{7.44}$$
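The numbers in Example 7-3 can be evaluated with a short script. This is a sketch under stated assumptions: the average speed of sound `v_a` is a hypothetical value (the example notes that data for $v_a$ are needed), and the function names are illustrative. The effective temperature follows from Eq. (7.42) for black walls, and the heat flux from Eq. (7.41) or, for diffuse-gray walls, Eq. (7.44).

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s


def sigma_sb_phonon(v_a):
    """Phonon Stefan-Boltzmann constant, pi^2 k_B^4 / (40 hbar^3 v_a^2)."""
    return math.pi ** 2 * K_B ** 4 / (40.0 * HBAR ** 3 * v_a ** 2)


def ballistic_heat_flux(T1, T2, v_a, eps1=1.0, eps2=1.0):
    """Heat flux in the radiative thin limit, Eq. (7.44); reduces to Eq. (7.41) for black walls."""
    sigma = sigma_sb_phonon(v_a)
    return sigma * (T1 ** 4 - T2 ** 4) / (1.0 / eps1 + 1.0 / eps2 - 1.0)


def effective_temperature_black_walls(T1, T2):
    """Effective temperature inside the medium from Eq. (7.42), black walls only."""
    return (0.5 * (T1 ** 4 + T2 ** 4)) ** 0.25


T1, T2 = 100.0, 20.0   # K, from Example 7-3
v_a = 5000.0           # m/s, hypothetical value
T_eff = effective_temperature_black_walls(T1, T2)
q_black = ballistic_heat_flux(T1, T2, v_a)
q_gray = ballistic_heat_flux(T1, T2, v_a, eps1=0.5, eps2=0.5)
print(f"T_eff = {T_eff:.1f} K, q_black = {q_black:.3e} W/m^2, q_gray = {q_gray:.3e} W/m^2")
```

The effective temperature comes out at about 84.1 K, uniform across the medium, with a temperature jump at each wall. It does not depend on `v_a`, whereas the heat flux scales as $1/v_a^2$ through $\sigma'_{SB}$.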

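7.2.2 Solution of the EPRT

The two-flux method is very helpful in developing a solution of the EPRT in planar structures. The equations for the forward and backward intensities, denoted respectively by superscripts (+) and (−), can be separated. In the steady state, we have:

$$\mu\frac{\partial I^+_\omega}{\partial x} = \frac{I^*_\omega - I^+_\omega}{\Lambda}, \quad\text{when } 0 < \mu < 1 \tag{7.45a}$$

$$\mu\frac{\partial I^-_\omega}{\partial x} = \frac{I^*_\omega - I^-_\omega}{\Lambda}, \quad\text{when } -1 < \mu < 0 \tag{7.45b}$$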

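where we have assumed that the medium is gray.44,45 If we further assume that the walls are diffuse and gray, then the boundary conditions become

$$T(0) = T_1 \quad\text{and}\quad T(L) = T_2 \tag{7.46}$$

Thus,

$$I^+_\omega(0,\mu) = \varepsilon_1 I^*_\omega(T_1) + (1 - \varepsilon_1)I^-_\omega(0,-\mu) \tag{7.47}$$

$$I^-_\omega(L,-\mu) = \varepsilon_2 I^*_\omega(T_2) + (1 - \varepsilon_2)I^+_\omega(L,\mu) \tag{7.48}$$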


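The solutions of Eq. (7.45a) and Eq. (7.45b) can be expressed as follows:

$$I^+_\omega(x,\mu) = I^+_\omega(0,\mu)\exp\left(-\frac{x}{\mu\Lambda}\right) + \int_0^x I^*_\omega(\xi)\exp\left(-\frac{x-\xi}{\mu\Lambda}\right)\frac{d\xi}{\mu\Lambda} \quad\text{for } \mu > 0 \tag{7.49}$$

and

$$I^-_\omega(x,\mu) = I^-_\omega(L,\mu)\exp\left(\frac{L-x}{\mu\Lambda}\right) - \int_x^L I^*_\omega(\xi)\exp\left(-\frac{x-\xi}{\mu\Lambda}\right)\frac{d\xi}{\mu\Lambda} \quad\text{for } \mu < 0 \tag{7.50}$$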

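In Eq. (7.49), the first term represents the intensity originating from the left surface, after being attenuated, and the second term is the contribution of generation, which is subject to attenuation as well. Equation (7.50) is interpreted in the reverse sense, for intensity traveling from the right to the left. The spectral heat flux, defined in Eq. (7.37), can be written as

$$q''_\omega = 2\pi\int_0^1\left[I^+_\omega(0,\mu)\exp\left(-\frac{x}{\mu\Lambda}\right) - I^-_\omega(L,-\mu)\exp\left(-\frac{L-x}{\mu\Lambda}\right)\right]\mu\,d\mu + 2\pi\int_0^x I^*_\omega(\xi)E_2\left(\frac{x-\xi}{\Lambda}\right)\frac{d\xi}{\Lambda} - 2\pi\int_x^L I^*_\omega(\xi)E_2\left(\frac{\xi-x}{\Lambda}\right)\frac{d\xi}{\Lambda} \tag{7.51a}$$

Here again, $E_m(x) = \int_0^1\mu^{m-2}e^{-x/\mu}\,d\mu$ is the mth exponential integral. If the surface is diffuse, then

$$q''_\omega = 2\pi I^+_\omega(0)E_3\left(\frac{x}{\Lambda}\right) - 2\pi I^-_\omega(L)E_3\left(\frac{L-x}{\Lambda}\right) + 2\pi\int_0^x I^*_\omega(\xi)E_2\left(\frac{x-\xi}{\Lambda}\right)\frac{d\xi}{\Lambda} - 2\pi\int_x^L I^*_\omega(\xi)E_2\left(\frac{\xi-x}{\Lambda}\right)\frac{d\xi}{\Lambda} \tag{7.51b}$$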
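The exponential integrals that appear in Eq. (7.51) are easy to evaluate directly from their definition. The following is a minimal sketch using a midpoint rule on the defining integral; the function name `exp_integral` and the number of quadrature points are illustrative choices, and a library routine would normally be preferred in production work.

```python
import math


def exp_integral(m, x, n=20000):
    """E_m(x) = integral of mu^(m-2) * exp(-x/mu) over 0 < mu <= 1, by the midpoint rule.

    Intended for x >= 0 and m >= 1; for m = 1 the argument x must be positive.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        mu = (i + 0.5) * h
        total += mu ** (m - 2) * math.exp(-x / mu)
    return total * h


# Limiting values used when differentiating the heat flux: E_2(0) = 1 and E_3(0) = 1/2.
print(exp_integral(2, 0.0), exp_integral(3, 0.0))

# Derivative relation dE_3/dx = -E_2, checked by a central difference at x = 1.
dx = 1e-3
slope = (exp_integral(3, 1.0 + dx) - exp_integral(3, 1.0 - dx)) / (2.0 * dx)
print(slope, -exp_integral(2, 1.0))
```

The derivative relation $dE_m/dx = -E_{m-1}$ is what turns the $E_3$ and $E_2$ terms of the heat flux into $E_2$ and $E_1$ terms upon differentiation with respect to x.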

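Note that energy balance requires that $\dfrac{dq''_x}{dx} = \displaystyle\int_0^{\omega_m}\frac{\partial}{\partial x}q''_\omega(x,\omega)\,d\omega = 0$. Differentiation of Eq. (7.51b) yields

$$\frac{\partial q''_\omega}{\partial x} = -\frac{2\pi}{\Lambda}I^+_\omega(0)E_2\left(\frac{x}{\Lambda}\right) - \frac{2\pi}{\Lambda}I^-_\omega(L)E_2\left(\frac{L-x}{\Lambda}\right) - \frac{2\pi}{\Lambda}\int_0^L I^*_\omega(\xi)E_1\left(\frac{|x-\xi|}{\Lambda}\right)\frac{d\xi}{\Lambda} + \frac{4\pi}{\Lambda}I^*_\omega(x) \tag{7.52}$$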

In radiative transfer, we call J1 1pI

v (0)dv and J2 1 pIv (L)dv the total radiosities at surfaces 1 and 2, respectively, and eb(T ) 1pIv* dv the total blackbody emissive power. Therefore,

L Zx jZ dj x Lx 2eb(T(x)) J1E2 a b J2E2 a b 3 eb(T(j))E1 a b 0

(7.53)
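The exponential integrals $E_m$ that appear in Eqs. (7.51) through (7.53) can be evaluated directly from their definition. The following is a minimal sketch, assuming a plain midpoint rule is accurate enough; the function name `exp_integral` and the default number of panels are illustrative choices, not part of the text. It relies on two standard properties: $E_m(0) = 1/(m-1)$ for $m \ge 2$, and $dE_{m+1}/dx = -E_m$, which is the relation used when differentiating the heat flux expression.

```python
import math

def exp_integral(m, x, n=20000):
    """Approximate E_m(x) = integral_0^1 mu^(m-2) * exp(-x/mu) d(mu).

    Midpoint rule with n panels (an illustrative choice).
    For m = 1 the argument x must be positive.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        mu = (i + 0.5) * h          # midpoint of the i-th panel
        total += mu ** (m - 2) * math.exp(-x / mu)
    return total * h

if __name__ == "__main__":
    x = 0.5
    step = 1e-4
    # Central-difference derivative of E_3 compared with -E_2
    dE3 = (exp_integral(3, x + step) - exp_integral(3, x - step)) / (2 * step)
    print("E2(0) =", exp_integral(2, 0.0))
    print("E3(0) =", exp_integral(3, 0.0))
    print("dE3/dx + E2 =", dE3 + exp_integral(2, x))
```

A dedicated special-function library would normally be preferred for production work; the quadrature above is only meant to make the definition concrete.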

This is the same as the radiative equilibrium condition, given in Eq. (7.38). We cannot set Eq. (7.52) to zero at all frequencies when local equilibrium does not exist, even for a gray medium.

Example 7-4. Find the temperature distribution, the heat flux, and the thermal conductivity for a gray medium, with diffuse-gray surfaces, in the radiative thick limit, i.e., Kn $\ll$ 1, under two extreme conditions: (1) $T_1, T_2 \ll \Theta_D$ and (2) $T_1, T_2 > \Theta_D$.

Solution. In the radiative thick limit, the first two terms in Eq. (7.51a) can be dropped as long as $x$ is not too close to either surface. Applying the first-order Taylor expansion $I_\omega^*(\xi) = I_\omega^*(x) - (dI_\omega^*/dx)(x-\xi) + \cdots$ and letting $z = |x-\xi|/\Lambda$ in the third and fourth terms, we obtain

$$q''_\omega = -4\pi\Lambda\frac{\partial I_\omega^*}{\partial x}\int_0^\infty zE_2(z)\,dz = -\frac{4\pi\Lambda}{3}\frac{\partial I_\omega^*}{\partial x} \tag{7.54}$$

since $\int_0^\infty zE_2(z)\,dz = 1/3$. In fact, this equation applies everywhere inside the medium because the spectral heat flux is continuous in the radiative thick limit. Integrating Eq. (7.54) over the frequencies of interest, we see that, under condition (1),

$$q''_x = -\frac{16\Lambda\sigma'_{SB}T^3}{3}\frac{dT}{dx}, \quad \text{when } T \ll \Theta_D \tag{7.55a}$$

This is nothing but a heat diffusion equation if we define the thermal conductivity as

$$k(T) = \frac{16}{3}\Lambda\sigma'_{SB}T^3 \tag{7.55b}$$

Comparing with $k(T) = \frac{1}{3}Cv_g\Lambda$, we notice from the previous equation that $Cv_g = 16\sigma'_{SB}T^3$. In the radiative thick limit, the temperature distribution is continuous at the wall, i.e., $T(0^+) = T(0) = T_1$ and $T(L^-) = T(L) = T_2$. Furthermore, the radiosity at the wall becomes the blackbody emissive power, even though the surface is not black; thus, we can integrate Eq. (7.54) over $x$ from 0 to $L$:

$$\int_0^L q''_x\,dx = -\frac{4}{3}\Lambda\sigma'_{SB}\int_{T_1}^{T_2} 4T^3\,dT$$

which gives

$$q''_x = \frac{4}{3}\mathrm{Kn}\left(\sigma'_{SB}T_1^4 - \sigma'_{SB}T_2^4\right) \tag{7.56a}$$

as well as the temperature distribution:

$$T(x) = \left[T_1^4 - \frac{x}{L}\left(T_1^4 - T_2^4\right)\right]^{1/4} \tag{7.56b}$$

which is linear in terms of the fourth power of temperature. From the definition of thermal resistance, $q''_x = (T_1 - T_2)/R''_t$, we have

$$R''_t = \frac{3}{4\sigma'_{SB}\mathrm{Kn}\,(T_1 + T_2)(T_1^2 + T_2^2)} \tag{7.57}$$
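The radiative thick results of Eqs. (7.56a), (7.56b), and (7.57) are simple enough to check numerically. The sketch below assumes the phonon Stefan-Boltzmann constant $\sigma'_{SB}$ is supplied by the caller (its value depends on the material and is not given here); the function names and the sample numbers are illustrative only.

```python
def thick_limit_flux(T1, T2, Kn, sigma):
    """Heat flux in the radiative thick limit at low temperature, Eq. (7.56a)."""
    return (4.0 / 3.0) * Kn * sigma * (T1**4 - T2**4)

def thick_limit_temperature(x_over_L, T1, T2):
    """Temperature profile of Eq. (7.56b); T^4 varies linearly with x/L."""
    return (T1**4 - x_over_L * (T1**4 - T2**4)) ** 0.25

def thick_limit_resistance(T1, T2, Kn, sigma):
    """Thermal resistance of Eq. (7.57)."""
    return 3.0 / (4.0 * sigma * Kn * (T1 + T2) * (T1**2 + T2**2))

if __name__ == "__main__":
    # Hypothetical inputs: sigma = 1.0 is a placeholder, not a material value.
    T1, T2, Kn, sigma = 10.0, 5.0, 0.01, 1.0
    q = thick_limit_flux(T1, T2, Kn, sigma)
    R = thick_limit_resistance(T1, T2, Kn, sigma)
    print("q'' =", q, " R'' * q'' =", R * q, " T1 - T2 =", T1 - T2)
    print("T at mid-plane =", thick_limit_temperature(0.5, T1, T2))
```

The product of the resistance and the flux recovers $T_1 - T_2$, which confirms that Eq. (7.57) is just Eq. (7.56a) with $T_1^4 - T_2^4$ factored.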

Under condition (2), when the temperature is greater than the Debye temperature, we have

$$q''_x = -\frac{\omega_m^3 k_B\Lambda}{6\pi^2 v_p^2}\frac{dT}{dx}, \quad \text{when } T > \Theta_D \tag{7.58}$$

The thermal conductivity becomes $k(T) = \omega_m^3 k_B\Lambda/(6\pi^2 v_p^2)$, which, with $k = \frac{1}{3}Cv_g\Lambda$, implies that $Cv_g = \omega_m^3 k_B/(2\pi^2 v_p^2)$. A proper $\omega_m$ should be chosen so that only propagating phonons are considered. Assuming that the temperature difference is small so that we can approximate the thermal conductivity as a constant, we have

$$q''_x = \frac{Cv_g\mathrm{Kn}}{3}(T_1 - T_2) \tag{7.59}$$

The thermal resistance becomes $R''_t = 3/(Cv_g\mathrm{Kn})$, which increases as $L$ increases. The temperature distribution is linear. One should realize that the scattering rate increases with temperature, due to phonon-phonon scattering, and depends on the frequency. If we look at the radiative equilibrium condition again, by assuming $T_1 > T_2$, we see that $I_\omega^+ > I_\omega^* > I_\omega^-$. Therefore, local equilibrium is not a stable-equilibrium state. In the radiative thick limit, the difference between $I_\omega^+$ and $I_\omega^-$ is caused by the spatial variation of $I_\omega^*$, as can be clearly seen from Eq. (7.49) and Eq. (7.50). Hence, Eq. (7.40) is a good approximation. In the radiative thin limit, according to Eq. (7.34), Eq. (7.59) becomes

$$q''_x = \frac{1}{4}Cv_g(T_1 - T_2) \tag{7.60}$$

Although no closed form exists for the solution of the ERT between the thick and thin limits, a number of approximation techniques and numerical methods can be used to provide satisfactory solutions, such as the discrete ordinates method ($S_N$ approximation) and the spherical harmonics method ($P_N$ approximation). It is important to see that, except in the radiative thick limit, energy transfer occurs inside the medium in two ways: one is through exchange with the walls, and the other is through diffusion. For this reason, a ballistic-diffusion approximation has been developed to solve the EPRT; see Chen (Phys. Rev. Lett., 86, 2297, 2001).

In general, the temperature distribution looks like that in Fig. 4.12b if $T_2$ is comparable to the Debye temperature. If $T_1 \ll \Theta_D$, then the temperature distribution can be plotted in terms of $T^4$ so that the distribution looks more or less linear. There exists a temperature jump such that $T(0^+) \neq T(0)$ and $T(L^-) \neq T(L)$, except in the radiative thick limit. Understanding that the temperature is only an effective temperature and given such a temperature distribution, one may assume that there is a thermal resistance at each boundary and an internal thermal resistance, which may be described by Fourier’s heat conduction.41

For thermal radiative transfer in the absence of heat conduction, there exists a radiation slip or radiation jump at the boundary, unless the medium is optically thick. Without a participating medium, photons do not scatter off one another to dissipate heat or transfer heat by diffusion. This is a distinction between photons and phonons. Radiation slip is manifested by a discontinuous change of the intensity at the boundary. The temperature in the medium adjacent to the wall differs from the surface temperature. Such a temperature jump does not exist in classical Fourier’s heat conduction theory; however, both velocity slip and temperature jump have already been incorporated in microfluidics research, as discussed in Chap. 4 [see Eq. (4.94)]. The temperature-jump concept was first applied in the study of heat conduction in rarefied gases over 100 years ago.

A straightforward approach for phonon transport is to sum up the thermal resistances in the radiative thin and thick limits. The heat flux at very low temperatures can be expressed as

$$q''_x = \frac{4\Lambda}{3L}\,\frac{\sigma'_{SB}T_1^4 - \sigma'_{SB}T_2^4}{1 + \dfrac{4\mathrm{Kn}}{3}\left(\dfrac{1}{\varepsilon_1} - \dfrac{1}{2} + \dfrac{1}{\varepsilon_2} - \dfrac{1}{2}\right)} \tag{7.61}$$

Here, we separately write $(1/\varepsilon_1 - 1/2)$ and $(1/\varepsilon_2 - 1/2)$ to emphasize the thermal resistance due to radiation slip at each boundary. In the radiative thick limit, the temperature jump approaches zero as Kn $\to$ 0. Basically, Eq. (7.61) reduces to Eq. (7.44) and Eq. (7.56a) in the two extremes. If the walls can be treated as blackbodies, i.e., $\varepsilon_1 = \varepsilon_2 = 1$, and the temperature difference between $T_1$ and $T_2$ is small, we can approximate the heat flux as follows:

$$q''_x = \frac{k_b}{1 + 4\mathrm{Kn}/3}\frac{\Delta T}{L} = k_{\mathrm{eff}}\frac{\Delta T}{L} \tag{7.62}$$

where $\Delta T = T_1 - T_2 \ll T_2 < T_1$, the bulk thermal conductivity is $k_b(T) = \frac{16}{3}\Lambda\sigma'_{SB}T^3$, and the effective conductivity of the film is

$$k_{\mathrm{eff}} = \frac{k_b}{1 + 4\mathrm{Kn}/3} \tag{7.63}$$
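The way Eq. (7.61) bridges the two limits can be seen by evaluating it for very large and very small Knudsen numbers. The following sketch is an illustration only: the function names are made up here, and `sigma` stands for $\sigma'_{SB}$, whose value is treated as a caller-supplied assumption.

```python
def low_T_flux(T1, T2, Kn, eps1, eps2, sigma):
    """Heat flux across a film at very low temperature, Eq. (7.61)."""
    slip = (1.0 / eps1 - 0.5) + (1.0 / eps2 - 0.5)   # radiation-slip terms
    diffusive = (4.0 * Kn / 3.0) * sigma * (T1**4 - T2**4)
    return diffusive / (1.0 + (4.0 * Kn / 3.0) * slip)

def keff_ratio_black(Kn):
    """k_eff / k_b for black walls and a small temperature difference, Eq. (7.63)."""
    return 1.0 / (1.0 + 4.0 * Kn / 3.0)

if __name__ == "__main__":
    T1, T2, e1, e2, sigma = 10.0, 5.0, 0.5, 0.8, 1.0   # placeholder inputs
    emissive_diff = sigma * (T1**4 - T2**4)
    thin = emissive_diff / (1.0 / e1 + 1.0 / e2 - 1.0)  # Eq. (7.44)
    print("Kn large:", low_T_flux(T1, T2, 1e8, e1, e2, sigma), "vs thin limit", thin)
    print("Kn small:", low_T_flux(T1, T2, 1e-8, e1, e2, sigma),
          "vs thick limit", (4.0 / 3.0) * 1e-8 * emissive_diff)
    print("k_eff/k_b at Kn = 0.75:", keff_ratio_black(0.75))
```

For Kn = 0.75 the ratio is exactly one half, i.e., a film whose thickness is 4/3 of the mean free path conducts half as well as the bulk in this approximation.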


At relatively high temperatures close to the Debye temperature, from Eq. (7.59) and Eq. (7.60), we can write

$$q''_x = \frac{k_b}{1 + \dfrac{4\mathrm{Kn}}{3}\left(\dfrac{1}{\varepsilon_1} + \dfrac{1}{\varepsilon_2} - 1\right)}\frac{T_1 - T_2}{L} = k_{\mathrm{eff}}\frac{T_1 - T_2}{L} \tag{7.64}$$

where $k_b(T) = \frac{1}{3}Cv_g\Lambda$. Equation (7.64) gives the same conductivity ratio $k_{\mathrm{eff}}/k_b$ as in Eq. (7.63) for blackbody walls. These effective thermal conductivities are on the same order of magnitude as those derived in Sec. 5.5.5, based on simple geometric arguments and Matthiessen’s rule for the mean free path given in Eq. (5.116). In previous chapters, however, we did not elaborate on the nature of nonequilibrium and the necessity of defining an effective temperature. It is interesting that different schools of thought can result in rather consistent results. The heat diffusion equation per se cannot tell us the cause of a temperature jump or how to evaluate it. The phonon BTE enables us to explore the microscopic phenomena and helps evaluate the parameters and the properties. The microscopic understanding and the macroscopic phenomenological equations can indeed work together to provide an effective thermal analysis tool.

The preceding discussions are consistent with the detailed derivation of the temperature jump or the radiation slip, originally formulated by Deissler (J. Heat Transfer, C86, 240, 1964), for situations not too far from the radiative thick limit. Nevertheless, the expressions given in Eq. (7.61) and Eq. (7.64) can be approximately applied between the diffusion and ballistic extremes. It should be noted that when the temperature jump is treated as a thermal resistance at the boundary, Fourier’s law can be used for the heat conduction inside the medium with the bulk thermal conductivity. This is very different from heat conduction along the film. While there seems to be no problem in understanding the meaning of emissivity for optical radiation, a question still remains as to how to interpret the boundary conditions in the case of phonon conduction.
If a multilayer structure is considered, we need to better understand the reflection and the transmission of phonons at the interfaces between dissimilar materials. A three-layer structure is shown in Fig. 7.6 to illustrate the temperature distribution in a multilayer structure.

FIGURE 7.6 Temperature distribution in a multilayer structure, with thermal boundary resistance, and the thermal resistance network representation. Here, $R_i$ is the internal resistance in the $i$th layer due to heat conduction, and $R_{ij}$ is the thermal boundary resistance between the $i$th and $j$th media. Two temperatures are needed to specify the effective temperature of different media at the interface.

Depending on the temperature range, it seems that we can conveniently determine the internal thermal resistance with Fourier’s law, i.e., $R''_i = L_i/k_i$. For the thermal resistance at the interface inside the layered structures, we could replace the emissivity with the transmissivity $\Gamma_{ij}$ such that $R''_{ij} = \frac{4}{3}(\Lambda_i/k_i)(\Gamma_{ij}^{-1} - \frac{1}{2}) + \frac{4}{3}(\Lambda_j/k_j)(\Gamma_{ji}^{-1} - \frac{1}{2})$. At the boundaries, we can still use $R''_{H1} = \frac{4}{3}(\Lambda_1/k_1)(\varepsilon_1^{-1} - \frac{1}{2})$ and $R''_{3L} = \frac{4}{3}(\Lambda_3/k_3)(\varepsilon_3^{-1} - \frac{1}{2})$. The heat flux can be estimated by $q''_x = (T_H - T_L)/R''_{\mathrm{tot}}$, where $R''_{\mathrm{tot}}$ is the sum of all thermal resistances. The effective thermal conductivity of the whole layered structure becomes $k_{\mathrm{eff}} = L_{\mathrm{tot}}/R''_{\mathrm{tot}}$. The details were presented by Chen and Zeng, who further considered nondiffuse surfaces and defined equivalent equilibrium temperatures.41 The assumption is that the deviation from the radiative thick limit is not significant. If we are dealing with the ballistic regime, we might need to consider phonon wave effects as well as the quantum size effect. Models for thermal boundary resistance will be discussed in the next subsection.

It is intriguing to apply the same approach to electron systems for the study of both electrical conductivity and thermal conductivity of metallic solids, as well as metal-dielectric multilayer structures. The thermal resistance network method, however, cannot be easily extended to multidimensional problems or to transient heating by a localized heat source. Statistical models or atomistic simulations are necessary. Therefore, the extension of Fourier’s law for 1-D nonequilibrium heat transfer should be considered only as a special case.
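The resistance network described above amounts to a simple sum. The sketch below is one possible way to organize it, assuming each layer is given as a (thickness, bulk conductivity, mean free path) tuple and each internal interface as a pair of transmissivities; these data structures, the function names, and the sample numbers are illustrative assumptions rather than anything prescribed by the text.

```python
def boundary_resistance(mfp, k, transmissivity):
    """(4/3)(Lambda/k)(1/Gamma - 1/2); use the emissivity at an outer wall."""
    return (4.0 / 3.0) * (mfp / k) * (1.0 / transmissivity - 0.5)

def total_resistance(layers, gammas, eps_hot, eps_cold):
    """Sum internal, interface, and wall resistances of a layered structure.

    layers : list of (thickness, conductivity, mean_free_path), hot to cold side
    gammas : list of (Gamma_ij, Gamma_ji), one pair per internal interface
    """
    R = sum(L / k for L, k, _ in layers)              # Fourier part, L_i / k_i
    _, k_first, mfp_first = layers[0]
    _, k_last, mfp_last = layers[-1]
    R += boundary_resistance(mfp_first, k_first, eps_hot)
    R += boundary_resistance(mfp_last, k_last, eps_cold)
    pairs = zip(gammas, layers[:-1], layers[1:])
    for (g_ij, g_ji), (_, k_i, mfp_i), (_, k_j, mfp_j) in pairs:
        R += boundary_resistance(mfp_i, k_i, g_ij)
        R += boundary_resistance(mfp_j, k_j, g_ji)
    return R

def effective_conductivity(layers, gammas, eps_hot, eps_cold):
    """k_eff = L_tot / R_tot for the whole stack."""
    L_tot = sum(L for L, _, _ in layers)
    return L_tot / total_resistance(layers, gammas, eps_hot, eps_cold)

if __name__ == "__main__":
    # Single film, black walls, Kn = 0.75: expect k_eff = k_b / 2 as in Eq. (7.63).
    film = [(100e-9, 10.0, 75e-9)]
    print(effective_conductivity(film, [], 1.0, 1.0))
```

For a single layer with black walls the sum collapses to $(L/k_b)(1 + 4\mathrm{Kn}/3)$, so the network reproduces Eq. (7.63).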

7.2.3 Thermal Boundary Resistance (TBR)

Thermal resistance at the interface between dissimilar materials is very important for heat transfer in heterostructures. Let us first clarify the difference between thermal contact resistance and thermal boundary resistance (TBR). The former refers to the thermal resistance between two bodies, usually with very rough surfaces whose root-mean-square roughness $\sigma_{\mathrm{rms}}$ is greater than 0.5 μm, brought or joined together mechanically. For thermal contact resistance, readers are referred to a recent comprehensive review by Yovanovich.46 Originally, TBR referred to the resistance at the interface between two solids or between a liquid and a dielectric at low temperatures. Even when the materials are in perfect contact with each other, reflections occur when phonons travel toward the boundary, because of the difference in acoustic properties of adjacent materials. In practice, the interface can be atomically smooth, or with a roughness ranging from several tenths of a nanometer to several nanometers. The thermal resistance between a solid material and liquid helium is called the Kapitza resistance, first observed by the Russian physicist and 1978 Nobel Laureate Pyotr Kapitza, in the 1940s. This thermal resistance results in a temperature discontinuity at the boundary and has been modeled based on the acoustic mismatch model (AMM). Thermal boundary resistance exists between two dielectrics as well as between a metal and a dielectric. In a thin-film structure, an interface is often accompanied by the formation of an intermediate layer of mixed atoms. An extensive review of earlier studies can be found in the work of Swartz and Pohl in 1989;38 see also Stoner and Maris (Phys. Rev. B, 48, 16373, 1993). Prasher and Phelan (J. Supercond., 10, 473, 1997) reviewed the studies of TBR of high-temperature superconductors in both the normal and superconducting states, for applications in superconducting electronics and radiation detectors.
Little showed that the heat flux across the boundary of a perfectly joined interface between two solids is proportional to the difference in the fourth power of temperature on each side of the interface.39 This can be understood based on previous discussions of phonon radiative transfer and blackbody radiation. Consider longitudinal phonon modes that follow the linear dispersion in a Debye crystal, and assume that the interface is perfectly smooth.


At any given frequency, the transmission coefficients can be written as follows (with a small modification for consistency):39,47 t12 t21

4r1r2v2l1 cos u1 cos u2 (r1vl1 cos u2 r2vl2 cos u1)2 4r1r2v2l2 cos u1 cos u2 (r1vl1 cos u2 r2vl2 cos u1)2

(7.65a) (7.65b)

where subscripts 1 and 2 denote media 1 and 2, respectively, $\rho$ is the density, $v_l$ is the propagation speed of longitudinal phonons, and $\theta$ is the polar angle, as illustrated in Fig. 7.7. The scattering is assumed to be purely elastic since the phonon frequency is conserved. An analog of Snell’s law can be written as follows:

$$\frac{1}{v_{l1}}\sin\theta_1 = \frac{1}{v_{l2}}\sin\theta_2 \tag{7.66}$$

FIGURE 7.7 Schematic of phonon transport across an interface between two semi-infinite media, each at a thermal equilibrium.

If $v_{l1} > v_{l2}$, for incidence from medium 2 to 1, there exists a critical angle $\theta_c = \sin^{-1}(v_{l2}/v_{l1})$, beyond which all phonons will be reflected. Due to the boundary resistance, there will be a temperature difference across the interface. By assuming that the phonons are at equilibrium on either side, the heat flux from medium 1 to 2 can be expressed as follows:

$$q''_{1\to2} = \frac{1}{4\pi}\int_0^{\omega_m}\int_0^{2\pi}\int_0^{\pi/2}\hbar\omega\,v_{l1}f_1(\omega,T_1)\,\tau_{12}\,D(\omega)\cos\theta_1\sin\theta_1\,d\theta_1\,d\phi_1\,d\omega \tag{7.67}$$

If the distribution function is isotropic over the hemisphere, we have

$$q''_{1\to2} = \frac{1}{4}\frac{\Gamma_{12}}{v_{l1}^2}\int_0^{\omega_m}\hbar\omega\,v_{l1}^3 f_1(\omega,T_1)D(\omega)\,d\omega \tag{7.68}$$

where

$$\Gamma_{12} = \frac{1}{\pi}\int_0^{2\pi}\int_0^{\pi/2}\tau_{12}\cos\theta_1\sin\theta_1\,d\theta_1\,d\phi = 2\int_0^{\pi/2}\tau_{12}\cos\theta_1\sin\theta_1\,d\theta_1 \tag{7.69}$$

can be viewed as the hemispherical transmissivity. Note that

$$\frac{\Gamma_{21}}{v_{l2}^2} = \frac{2}{v_{l2}^2}\int_0^{\theta_c}\tau_{21}\cos\theta_2\sin\theta_2\,d\theta_2 = \frac{\Gamma_{12}}{v_{l1}^2} \tag{7.70}$$

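For the Debye density of states, we have

$$\frac{1}{4\pi}v_l\hbar\omega f(\omega,T)D(\omega) = \frac{\hbar\omega^3}{8\pi^3 v_l^2\left(e^{\hbar\omega/k_BT} - 1\right)}$$

Therefore, the net heat flux across the interface becomes

$$q''_x = q''_{1\to2} - q''_{2\to1} = \frac{1}{4}\frac{\Gamma_{12}}{v_{l1}^2}\int_0^{\omega_m}\hbar\omega\left[v_{l1}^3 f_1(\omega,T_1) - v_{l2}^3 f_2(\omega,T_2)\right]D(\omega)\,d\omega \tag{7.71}$$

where $D(\omega)$ in each term is the density of states of the corresponding medium, so that $v_l^3D(\omega) = \omega^2/(2\pi^2)$ on both sides, or

$$q''_x = \frac{\Gamma_{12}}{v_{l1}^2}\frac{k_B^4}{8\pi^2\hbar^3}\left(T_1^4\int_0^{x_{m1}}\frac{x^3\,dx}{e^x - 1} - T_2^4\int_0^{x_{m2}}\frac{x^3\,dx}{e^x - 1}\right) \tag{7.72}$$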

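where $x_{mj} = \hbar\omega_m/(k_BT_j)$. In the low-temperature limit, the upper limits tend to infinity and each integral approaches $\pi^4/15$, so we obtain

$$q''_x = \frac{\Gamma_{12}}{v_{l1}^2}\frac{\pi^2k_B^4}{120\hbar^3}\left(T_1^4 - T_2^4\right) \tag{7.73}$$

After replacing $v_{l1}^{-2}$ with $\sum_j v_{j1}^{-2} = v_{l1}^{-2} + 2v_{t1}^{-2}$, i.e., one longitudinal and two transverse phonon modes, we obtain

$$q''_x = \frac{\pi^2k_B^4}{120\hbar^3}\left(T_1^4 - T_2^4\right)\Gamma_{12}\sum_j v_{j1}^{-2} \tag{7.74}$$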
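The approach to the low-temperature limit can be checked by integrating Eq. (7.72) numerically and comparing with Eq. (7.74). The sketch below makes several assumptions that are not in the text: the sum over polarizations is applied to Eq. (7.72) in the same way as in Eq. (7.74), Simpson's rule is used for the Debye-type integral, the upper limit is capped at 100 (beyond which the integrand is negligible) to avoid floating-point overflow, and the sound speeds, cutoff frequency, and transmissivity in the example are hypothetical numbers.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
KB = 1.380649e-23        # Boltzmann constant, J/K

def debye_integral(x_max, n=20000):
    """Simpson's rule for the integral of x^3/(e^x - 1) from 0 to x_max."""
    x_max = min(x_max, 100.0)   # integrand is negligible beyond x ~ 100
    if n % 2:
        n += 1
    h = x_max / n

    def g(x):
        return 0.0 if x == 0.0 else x**3 / math.expm1(x)

    total = g(0.0) + g(x_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def interface_flux(T1, T2, gamma12, speeds1, omega_m):
    """Net flux from Eq. (7.72), summed over the polarizations of medium 1."""
    inv_v2 = sum(v**-2 for v in speeds1)
    pref = KB**4 / (8.0 * math.pi**2 * HBAR**3)
    i1 = debye_integral(HBAR * omega_m / (KB * T1))
    i2 = debye_integral(HBAR * omega_m / (KB * T2))
    return pref * gamma12 * inv_v2 * (T1**4 * i1 - T2**4 * i2)

def interface_flux_low_T(T1, T2, gamma12, speeds1):
    """Low-temperature limit, Eq. (7.74)."""
    inv_v2 = sum(v**-2 for v in speeds1)
    return math.pi**2 * KB**4 / (120.0 * HBAR**3) * (T1**4 - T2**4) * gamma12 * inv_v2

if __name__ == "__main__":
    speeds = [8000.0, 5000.0, 5000.0]   # hypothetical v_l, v_t, v_t in m/s
    print(interface_flux(2.0, 1.0, 0.3, speeds, 5e13))
    print(interface_flux_low_T(2.0, 1.0, 0.3, speeds))
```

At higher temperatures the upper limits $x_{mj}$ become finite and the two functions diverge from each other, which is where the choice of $\omega_m$ starts to matter.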

The TBR can now be obtained as $R''_b = (T_1 - T_2)/q''_x$. Furthermore, by assuming that the temperature difference is small, we can approximate $R''_b$ by

$$R''_b = \frac{30\hbar^3T^{-3}}{\pi^2k_B^4\,\Gamma_{12}\sum_j v_{j1}^{-2}} \tag{7.75}$$

which is inversely proportional to $T^3$. The characteristic wavelength is the most probable wavelength in the phonon distribution function. It can be approximated by

$$\lambda_{mp} \approx a\frac{\Theta_D}{T} \tag{7.76}$$

where $a$ is the lattice constant, on the order of 0.3 to 0.6 nm.47 Only when $\lambda_{mp} \gg \sigma_{\mathrm{rms}}$ can we assume that the scattering is completely specular. Even for atomically smooth interfaces, the characteristic wavelength for phonons will be on the same order of magnitude as the rms surface roughness when the temperature approaches the Debye temperature. The specularity parameter introduced in Eq. (5.131) is often used to approximate the fraction of specular reflection with respect to the total reflection. Another expression of the specularity parameter is

$$p = \exp\left(-\frac{16\pi^2\sigma_{\mathrm{rms}}^2}{\lambda^2}\right) \tag{7.77}$$
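Equations (7.76) and (7.77) can be combined to estimate how quickly an interface stops looking specular as the temperature rises. The following is a rough sketch under the assumption that the most probable wavelength of Eq. (7.76) is used as $\lambda$ in Eq. (7.77); the function names and the sample lattice constant, Debye temperature, and roughness are placeholders chosen for illustration.

```python
import math

def most_probable_wavelength(a, theta_D, T):
    """lambda_mp ~ a * Theta_D / T, Eq. (7.76)."""
    return a * theta_D / T

def specularity(sigma_rms, wavelength):
    """p = exp(-16 pi^2 sigma_rms^2 / lambda^2), Eq. (7.77)."""
    return math.exp(-16.0 * math.pi**2 * sigma_rms**2 / wavelength**2)

if __name__ == "__main__":
    a, theta_D, sigma_rms = 0.5e-9, 600.0, 0.3e-9   # hypothetical values
    for T in (1.0, 10.0, 100.0, 600.0):
        lam = most_probable_wavelength(a, theta_D, T)
        print(f"T = {T:6.1f} K  lambda_mp = {lam:.3e} m  p = {specularity(sigma_rms, lam):.4f}")
```

With these placeholder inputs the specularity stays close to one only at the lowest temperatures, consistent with the requirement that the wavelength be much larger than the roughness.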


Equation (7.77) has often been wrongly expressed, with $\pi^2$ mistaken as $\pi^3$, in the heat conduction literature, following a hidden typo in Ziman’s book, Electrons and Phonons.48 In the high-temperature limit, TBR is expected to be small, especially when compared with conduction in the solids. Other considerations are (a) the interface may not be perfectly smooth, (b) there exists an upper limit of the frequency or a lower limit of wavelength, and (c) phonons on either side of the boundary may not be in a local-equilibrium state. These difficulties pose some real challenges in modeling TBR. Nevertheless, we shall present the diffuse mismatch model (DMM) that was introduced by Swartz and Pohl.38 In the DMM, it is assumed that phonons will be scattered according to a probability that is determined by the properties of the two media but is independent of where the phonons originate. For phonons coming from medium 1, the transmission and reflection probabilities are related by $\Gamma_{12} + R_{12} = 1$. For phonons originating from medium 2, on the other hand, $\Gamma_{21} = R_{12}$ and $R_{21} = \Gamma_{12}$. Hence, the reciprocity requires that

$$\Gamma_{12} + \Gamma_{21} = 1 \tag{7.78a}$$

We can rewrite Eq. (7.70), considering all three polarizations, as follows:

$$\Gamma_{12}\sum_j v_{j1}^{-2} = \Gamma_{21}\sum_j v_{j2}^{-2} \tag{7.78b}$$

Solving Eq. (7.78a) and Eq. (7.78b), we get

$$\Gamma_{12} = \frac{\sum_j v_{j2}^{-2}}{\sum_j v_{j1}^{-2} + \sum_j v_{j2}^{-2}} \tag{7.79}$$
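Equation (7.79) needs only the sound speeds on the two sides of the interface. The sketch below evaluates it and confirms that the result satisfies both Eq. (7.78a) and Eq. (7.78b); the function name and the speeds in the example are hypothetical and do not correspond to any specific material pair.

```python
def dmm_transmissivity(speeds1, speeds2):
    """Gamma_12 from the diffuse mismatch model, Eq. (7.79).

    speeds1, speeds2 : phonon speeds of all polarizations in media 1 and 2
    (typically one longitudinal and two transverse values each).
    """
    s1 = sum(v**-2 for v in speeds1)
    s2 = sum(v**-2 for v in speeds2)
    return s2 / (s1 + s2)

if __name__ == "__main__":
    medium1 = [8000.0, 5000.0, 5000.0]   # hypothetical speeds, m/s
    medium2 = [5000.0, 3000.0, 3000.0]
    g12 = dmm_transmissivity(medium1, medium2)
    g21 = dmm_transmissivity(medium2, medium1)
    print("Gamma_12 =", g12, " Gamma_21 =", g21, " sum =", g12 + g21)
```

Phonons leaving the acoustically faster medium are transmitted with the higher probability in this model, and two identical media give a transmissivity of one half rather than one, which is a known feature of the diffuse assumption.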

The heat flux can be calculated according to

$$q''_x = \frac{k_B^4}{8\pi^2\hbar^3}\left(T_1^4\int_0^{x_{m1}}\frac{x^3\,dx}{e^x - 1} - T_2^4\int_0^{x_{m2}}\frac{x^3\,dx}{e^x - 1}\right)\Gamma_{12}\sum_j v_{j1}^{-2} \tag{7.80}$$

Equation (7.79) and Eq. (7.80) are the only equations needed to calculate TBR with the DMM. In addition to the Debye temperatures and the speeds of longitudinal and transverse waves, one would need to determine the upper limits of the integrals in Eq. (7.80). Alternatively, Eq. (7.80) can be recast using the volumetric heat capacities and the group velocities to obtain

$$q''_x = \frac{1}{4}\left(C_1v_{g1}T_1 - C_2v_{g2}T_2\right)\Gamma_{12} \tag{7.81}$$

One must be careful in applying the heat capacity in Eq. (7.81) since the heat capacity in the expression of thermal conductivity is different from rcp, unless at very low temperatures. Both the AMM and the DMM assume that the phonons are in equilibrium on each side of the interface, and do not take into account the nonequilibrium distribution of phonons. In multilayer thin films, especially in quantum wells and superlattices, when the film thickness is comparable with or smaller than the phonon mean free path, thermal transport inside the film cannot be modeled as pure diffusion anymore. A detailed treatment of temperature-jump conditions and boundary resistance in superlattices was performed by Chen and Zeng.40,41 Majumdar (J. Heat Transfer, 113, 797, 1991) proposed a modified AMM by modeling interface roughness using a fractal structure. In this study, the reflection was approximated by geometric optics, which is applicable when the phonon wavelength is smaller than the autocorrelation length of the rough surface. TBR between highly dissimilar materials, metal-metal interface, and metal-dielectric interface have been the areas of some recent studies; see Majumdar and Reddy (Appl. Phys. Lett., 84, 4768), Ju et al.


(J. Heat Transfer, 128, 919, 2006), Lyeo and Cahill (Phys. Rev. B, 73, 144301, 2006), and Gundrum et al. (Phys. Rev. B, 72, 245426, 2006).

Let us consider how to model transient heat conduction in thin films. The single relaxation time approximation appears to be limited to the timescale $t > \tau$. Joshi and Majumdar (J. Appl. Phys., 74, 31, 1993) performed a transient analysis of the EPRT. However, the use of Eq. (7.43) implies the presumption of the local-equilibrium condition, which is not valid for large temperature gradients. If worked out properly, the EPRT is applicable for $t > \tau$. If the temperature jumps can be properly taken into consideration, it appears that Fourier’s law should be applicable when $t \gg \tau$. The question is, “How long will it take for the temperature-jump conditions to be justified?” In order to model a timescale less than $\tau$, it would be interesting to see if there exists another scattering mechanism that has a much smaller relaxation time than $\tau$ and that does not transfer or dissipate heat, like the N processes discussed earlier or phonons in the optical branch. In the two-fluid model, the superfluid moves forward freely, without any viscosity, but conserves kinetic energy as it moves around. This is the principle of superfluidity in liquid helium and superconductivity for electrons. The superfluid does not carry thermal energy, nor does it dissipate heat. Although the N processes do not carry heat forward, these processes are important for the redistribution of phonons. The two-relaxation-time model developed by Callaway21 and Guyer and Krumhansl22, or the Jeffreys-type equation, might be applicable in extreme cases, e.g., in a nanotube at very small timescales, on the order of femtoseconds. The solution describes a wavelike characteristic that is a combination of a damped wave and a weak diffusion process, which enables an instantaneous response, intrinsic to all heat conduction processes, as justified by statistical mechanics.
As mentioned earlier, the wavelike behavior has been demonstrated recently in SWNTs, via molecular dynamics simulation, although a lot of work needs to be done to extend the simulation to multilayer structures. Let us emphasize again that Fourier’s heat diffusion appears to be universal for heat conduction, and the hyperbolic heat equation, Eq. (7.5), can be neither physically justified nor practically useful. In the extremely acoustic thin limit, we are dealing with quantum conductance or the Schrödinger wave equation. This wave phenomenon cannot be explained by the hyperbolic heat equation.

7.3 HEAT CONDUCTION REGIMES

There has been a continuous effort to delineate the regimes of microscale heat conduction since 1992. A number of references have already been cited in Chap. 5. A recent effort has been made by Escobar et al.43 Following the previous discussions in this chapter, let us schematically depict the regimes of heat conduction, especially by electrons and phonons in crystalline solids, as in Fig. 7.8. Here, $\tau_c$ is known as the effective collision interaction time, or simply collision time, since collision does not occur instantaneously but is through intermolecular potential and force interactions. These forces become important only when the particles come very close to each other. Of course, this is the classical picture of atomic or molecular interactions. Electrons and phonons are quantum mechanical particles; thus, the interaction is via the wavefunctions predicted by Schrödinger’s equations. For ultrafast pulse heating, the collision time can be the time required for a photon and an electron to interact. Generally speaking, the collision time is much shorter than the relaxation time and is neglected in the BTE. The characteristic phonon or electron wavelength $\lambda$ is assumed to be less than the mean free path $\Lambda$. Region 1 is the macroscale regime where Fourier’s law and the heat diffusion equation can be applied, when the timescale is greater than $\tau$ and the length scale is greater than about $10\Lambda$. Region 2 is called the mesoscale or quasi-equilibrium regime, which is characterized by the classical size effect. This region is also known as the first microscale. For


FIGURE 7.8 Heat conduction regimes.

heat transfer along a film or a wire, local-equilibrium assumption is appropriate and boundary scattering reduces the effective mean free path and thermal conductivity. For heat transfer across a film or a multilayer, it is possible to use Fourier’s law inside the medium by considering an effective temperature and the temperature-jump boundary condition. It is difficult, if not impossible, to apply Fourier’s law to complex geometries or local heating. The two-temperature model for fast laser heating can be in either region 1 or 2, depending on how the length scale is compared with the mean free path. Most of the research on microscale heat transfer between 1990 and 2005 dealt with the microscale phenomena in region 2. Region 3 is the regime of wave behavior, which is described by Schrödinger’s wave equations and where quantum tunneling can occur. Quantum size effect becomes significant on thermal conductivity and specific heat. Quantum conductance is a special case of quantum tunneling, for which the ballistic processes are confined in one dimension through a channel. For very thin layers, wave interference may become important. However, due to the interface roughness, the coherence may be destroyed so that the energy ray method or the particle approach can still be applied at very small length scales. We will give a comprehensive treatment of electromagnetic wave interference and scattering phenomena in subsequent chapters, without discussing the nature of acoustic waves further. The region on the upper left is said to be of no interest at short timescales because a thermal disturbance cannot travel that far and affect the temperature field. Region 4 is designed to represent the wavelike behavior, described by the Jeffreys-type equation, Eq. (7.14). When we say Jeffreys-type equation, we mean that both k0 and k1 in Eq. (7.12a) are positive. 
As discussed earlier, $\tau_N$ is the second relaxation time for phonon scattering that does not transfer or dissipate thermal energy, as in the N processes. In this regime, the BTE based on the two-relaxation-time approximation may be applied.21,22 This regime includes the heat pulse propagation and the second sound in dielectric crystals, at


low temperatures. It suffices to say that this region, while of great academic interest, has very limited applications. The pure hyperbolic heat equation, however, predicts a nonphysical wavefront and cannot be applied without the additional diffusion term. Nevertheless, theoretical studies of the hyperbolic heat equation have helped in a better understanding of the heat transfer behavior at short timescales and, subsequently, facilitated the development of more realistic models. While the lagging heat equation can mathematically describe both wavelike behavior and parallel heat conduction, it does not provide any new physics. On the other hand, the memory concept may be related to the anharmonic and nonlinear effects that are inherent to the solid and crystal structures. Study of the thermomechanical and thermoelastic effects, and thermal transport in polymers and inhomogeneous materials, such as biological materials, may require empirical and semiempirical models. The lagging heat equation or similar differential equations may be quite helpful in these applications. Region 5 belongs to the nanoscale regime, where it is necessary to employ quantum or sometimes classical molecular dynamics to study the underlying phenomena. The dashed ellipse indicates the overlapping between different regions, where molecular dynamics simulation may provide rich information as well as a bridge between different timescales and length scales. Holland (Phys. Rev., 132, 2461, 1963) analyzed the effect of different polarizations on the thermal conductivity of germanium. Several studies have employed the Monte Carlo method to solve phonon transport equations.36,37,49 The lattice Boltzmann method has also been employed in a number of publications.14,43 Molecular dynamics has been applied to the study of TBR, including the interface between SWNTs.50–53 Chung et al. investigated the effect of different dispersion models on the lattice thermal conductivity.54 Narumanchi et al. 
used a finite volume method to solve the 2-D BTE during transient heat transport for a local heating source in silicon.55 They also demonstrated the feasibility of including phonon dispersion and polarization in the model. Nonequilibrium phonon transport in dimensions less than 100 nm has become an important issue in silicon-on-insulator transistors. Several studies have focused on a multiscale approach to model the thermal transport phenomena at the device level.42,43,56–58 Sinha and Goodson provided an extensive review on multiscale modeling.42

Thermal metrology includes measurements of temperature (thermometry), specific heat (calorimetry), and heat flux. Thermophysical properties, such as thermal conductivity and specific heat, can be measured with steady-state, modulated, or pulsed heating techniques. MEMS and NEMS have enabled the fabrication of miniaturized heaters and sensors. Furthermore, optical techniques such as thermoreflectance, Raman spectroscopy, photothermal radiometry, fluorescence, and laser flash techniques have been widely used in the measurement of thermal properties of nano/microstructured materials. Scanning thermal microscopy and near-field optical microscopy have further improved the spatial resolution. A large number of publications can be found in the bibliography of the present and previous chapters [see, e.g., Cahill et al.31 and references therein]. Recently, Abel et al. employed micro-Raman spectroscopy to measure the temperature distribution in silicon microstructures with a spatial resolution of 1 μm.59 Lee et al. performed a steady-state characterization of heated AFM cantilevers over a range of pressures for thermal metrology applications.60 Park et al. analyzed the frequency response of heated AFM cantilevers in the frequency range from 10 Hz to 1 MHz, and observed high-order harmonic responses, such as 3ω, 5ω, and 7ω, at frequencies below 100 kHz and an impedance effect at higher frequencies.61 Park et al.
also investigated the thermal behavior of heated cantilevers at cryogenic temperatures, down to 78 K.62 By measuring the thermal response at various frequencies, this study extracted the specific heat near the cantilever tip and the thermal conductivity along the heavily doped silicon legs, at temperatures ranging from 80 to 200 K. There appears to be a significant reduction in the thermal conductivity for the free-standing silicon cantilever, with a thickness of 0.59 μm, at low temperatures. These studies demonstrate that heated AFM cantilevers have become a promising thermal analysis tool at the micro- and nanoscales.59–62


7.4 SUMMARY

The present chapter, along with Chaps. 5 and 6, provided a comprehensive treatment of the thermal properties of, and the transport processes in, micro/nanostructured solid materials. This chapter focused on transient and nonequilibrium heat conduction, when the local-equilibrium condition is not satisfied and the conventional heat diffusion theory based on Fourier’s law cannot be justified. Several modified phenomenological theories were critically reviewed with an emphasis on their application regimes. The phonon BTE was presented using the EPRT, and the solutions were discussed for nonequilibrium heat transfer across a thin film or a multilayer structure. The basic models of TBR were outlined. Finally, a map of heat conduction regimes was presented to assist readers in choosing an appropriate methodology for a given situation, with a brief summary of advanced multiscale modeling and measurement techniques.

REFERENCES

1. H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, 2nd ed., Clarendon Press, Oxford, 1959.
2. M. N. Özişik, Heat Conduction, 2nd ed., Wiley, New York, 1993.
3. D. D. Joseph and L. Preziosi, “Heat waves,” Rev. Mod. Phys., 61, 41–73, 1989; D. D. Joseph and L. Preziosi, “Addendum to the paper ‘heat waves’,” Rev. Mod. Phys., 62, 375–391, 1990.
4. M. N. Özişik and D. Y. Tzou, “On the wave theory in heat conduction,” J. Heat Transfer, 116, 526–535, 1994.
5. W. K. Yeung and T. T. Lam, “A numerical scheme for non-Fourier heat conduction, Part I: One-dimensional problem formulation and applications,” Numer. Heat Transfer B, 33, 215–233, 1998.
6. A. Haji-Sheikh, W. J. Minkowycz, and E. M. Sparrow, “Certain anomalies in the analysis of hyperbolic heat conduction,” J. Heat Transfer, 124, 307–319, 2002.
7. J. Gembarovic and J. Gembarovic, Jr., “Non-Fourier heat conduction modeling in a finite medium,” Int. J. Thermophys., 25, 41261–41268, 2004.
8. M. B. Rubin, “Hyperbolic heat conduction and the second law,” Int. J. Eng. Sci., 30, 1665–1676, 1992.
9. C. Bai and A. S. Lavine, “On hyperbolic heat conduction and the second law of thermodynamics,” J. Heat Transfer, 117, 256–263, 1995.
10. A. Barletta and E. Zanchini, “Hyperbolic heat conduction and local equilibrium: A second law analysis,” Int. J. Heat Mass Transfer, 40, 1007–1016, 1997.
11. D. Jou, J. Casas-Vazquez, and G. Lebon, Extended Irreversible Thermodynamics, 2nd ed., Springer, Berlin, 1996.
12. A. Majumdar, “Microscale heat conduction in dielectric thin films,” J. Heat Transfer, 115, 7–16, 1993.
13. S. Volz, J.-B. Saulnier, M. Lallemand, B. Perrin, P. Depondt, and M. Mareschal, “Transient Fourier-law deviation by molecular dynamics in solid argon,” Phys. Rev. B, 54, 340–347, 1996.
14. J. Xu and X. Wang, “Simulation of ballistic and non-Fourier thermal transport in ultra-fast laser heating,” Physica B, 351, 213–226, 2004.
15. D. Y. Tzou, Macro- to Microscale Heat Transfer: The Lagging Behavior, Taylor & Francis, Washington DC, 1997.
16. P. J. Antaki, “Solution for non-Fourier dual phase lag heat conduction in a semi-infinite slab with surface heat flux,” Int. J. Heat Mass Transfer, 41, 2253–2258, 1998.
17. D. W. Tang and N. Araki, “Wavy, wavelike, diffusive thermal responses of finite rigid slabs to high-speed heating of laser-pulses,” Int. J. Heat Mass Transfer, 42, 855–860, 1999.
18. D. Y. Tzou and K. S. Chiu, “Temperature-dependent thermal lagging in ultrafast laser heating,” Int. J. Heat Mass Transfer, 44, 1725–1734, 2001.

NONEQUILIBRIUM ENERGY TRANSFER IN NANOSTRUCTURES

279

19. W. J. Minkowycz, A. Haji-Sheikh, and K. Vafai, “On departure from local thermal equilibrium in porous media due to a rapid changing heat source: The Sparrow number,” Int. J. Heat Mass Transfer, 42, 3373–3385, 1999. 20. W. Kaminski, “Hyperbolic heat conduction equation for materials with a nonhomogeneous inner structure,” J. Heat Transfer, 112, 555–560, 1990. 21. J. Callaway, “Model for lattice thermal conductivity at low temperatures,” Phys. Rev., 113, 1046–1951, 1959. 22. R. A. Guyer and J. A. Krumhansl, “Solution of the linearized phonon Boltzmann equation,” Phys. Rev., 148, 766–778, 1966; “Thermal conductivity, second sound, and phonon hydrodynamic phenomena in nonmetallic crystals,” Phys. Rev., 148, 778–788, 1966. 23. J. Shiomi and S. Maruyama, “Non-Fourier heat conduction in a single-walled carbon nanotube: Classical molecular dynamics simulations,” Phys. Rev. B, 73, 205420, 2006. 24. D. H. Tsai and R. A. MacDonald, “Molecular-dynamics study of second sound in a solid excited by a strong heat pulse,” Phys. Rev. B, 14, 4714–4723, 1976. 25. X. Wang and X. Xu, “Thermoelastic wave induced by pulsed laser heating,” Appl. Phys. A, 73, 107–114, 2001; X. Wang, “Thermal and thermomechanical phenomena in picosecond laser copper interaction,” J. Heat Transfer, 126, 355–364, 2004. 26. S. I. Anisimov, B. L. Kapeliovich, and T. L. Perel’man, “Electron emission from metal surfaces exposed to ultrashort laser pulses,” Sov. Phys. JETP, 39, 375–377, 1974. 27. T. Q. Qiu and C. L. Tien, “Short-pulse laser heating on metals,” Int. J. Heat Mass Transfer, 35, 719–726, 1992; T. Q. Qiu and C. L. Tien, “Size effect on nonequilibrium laser heating of metal films,” J. Heat Transfer, 115, 842–847, 1993; T. Q. Qiu, T. Juhasz, C. Suarez, W. E. Bron, and C. L. Tien, “Femtosecond laser heating of multi-layer metals—II. Experiments,” Int. J. Heat Mass Transfer, 37, 2799–2808, 1994. 28. J. L. Hostetler, A. N. Smith, D. M. Czajkowsky, and P. M. 
Norris, “Measurement of the electronphonon coupling factor dependence on film thickness and grain size in Au, Cr, and Al,” Appl. Opt., 38, 3614–3620, 1999. 29. S. Link, C. Burda, Z. L. Wang, and M. A. El-Sayed, “Electron dynamics in gold and gold-silver alloy nanoparticles: The influence of a nonequilibrium electron distribution and the size dependence of the electron-phonon relaxation,” J. Chem. Phys., 111, 1255–1264, 1999. 30. A. N. Smith and P. M. Norris, “Influence of intraband transition on the electron thermoreflectance response of metals,” Appl. Phys. Lett., 78, 1240–1242, 2001; R. J. Stevens, A. N. Smith, and P. M. Norris, “Measurement of thermal boundary conductance of a series of metal-dielectric interfaces by the transient thermoreflectance techniques,” J. Heat Transfer, 127, 315–322, 2005. 31. D. G. Cahill, K. Goodson, and A. Majumdar, “Thermometry and thermal transport in micro/nanoscale solid-state devices and structures,” J. Heat Transfer, 124, 223–241, 2002; D. G. Cahill, W. K. Ford, K. Goodson, et al., “Nanoscale thermal transport,” J. Appl. Phys., 93, 793–818, 2003. 32. D. M. Riffe, X. Y. Wang, M. C. Downer, et al., “Femtosecond thermionic emission from metals in the space-charge-limited regime,” J. Opt. Soc. Am. B, 10, 1424–1435, 1993. 33. A. N. Smith, J. L. Hostetler, and P. M. Norris, “Nonequilibrium heating in metal films: An analytical and numerical analysis.” Numer. Heat Transfer A, 35, 859–874, 1999. 34. D. Y. Zhou and K. S. Chiu, “Temperature-dependent thermal lagging in ultrafast laser heating,” Int. J. Heat Mass Transfer, 44, 1725–1734, 2001. 35. L. Jiang and H.-L. Tsai, “Energy transport and nanostructuring of dielectrics by femtosecond laser pulse trains,” J. Heat Transfer, 128, 926–933, 2006. 36. T. Klitsner, J. E. VanCleve, H. E. Fischer, and R. O. Pohl, “Phonon radiative heat transfer and surface scattering,” Phys. Rev. B, 38, 7576–7594, 1988. 37. R. B. 
Peterson, “Direct simulation of phonon-mediated heat transfer in a Debye crystal,” J. Heat Transfer, 116, 815–822, 1994. 38. E. T. Swartz and P. O. Pohl, “Thermal boundary resistance,” Rev. Mod. Phys., 61, 605–668, 1989. 39. W. A. Little, “The transport of heat between dissimilar solids at low temperatures,” Can. J. Phys., 37, 334–349, 1959. 40. G. Chen and C. L. Tien, “Thermal conductivity of quantum well structures,” J. Thermophys. Heat Transfer, 7, 311–318, 1993; G. Chen, “Size and interface effects on thermal conductivity of

280

41.

42. 43.

44. 45. 46. 47.

48.

49. 50.

51. 52. 53. 54. 55.

56. 57. 58. 59. 60.

61.

CHAPTER 7

superlattices and periodic thin-film structures,” J. Heat Transfer, 119, 220–229, 1997; G. Chen, “Thermal conductivity and ballistic-phonon transport in the cross-plane direction of superlattices,” Phys. Rev. B, 57, 14958–14973, 1998. G. Chen and T. Zeng, “Nonequilibrium phonon and electron transport in heterostructures and superlattices,” Microscale Thermophys. Eng., 5, 71–88, 2001; T. Zeng and G. Chen, “Phonon heat conduction in thin films: Impacts of thermal boundary resistance and internal heat generation,” J. Heat Transfer, 123, 340–347, 2001. S. Sinha and K. E. Goodson, “Review: Multiscale thermal modeling in nanoelectronics,” Int. J. Multiscale Comp. Eng., 3, 107–133, 2005. R. A. Escobar, S. S. Ghai, M. S. Jhon, and C. H. Amon, “Multi-length and time scale thermal transport using the lattice Boltzmann method with application to electronics cooling,” Int. J. Heat Mass Transfer, 49, 97–107, 2006. E. M. Sparrow and R. D. Cess, Radiation Heat Transfer, Augmented ed., McGraw-Hill, New York, 1978. M. F. Modest, Radiative Heat Transfer, McGraw-Hill, New York, 1993. M. M. Yovanovich, “Four decades of research on thermal contact, gap, and joint resistance in microelectronics,” IEEE Trans. Compon. Packag. Technol., 28, 182–206, 2005. P. E. Phelan, “Application of diffuse mismatch theory to the prediction of thermal boundary resistance in thin-film high-Tc superconductors,” J. Heat Transfer, 120, 37–43, 1998; L. De Bellis, P. E. Phelan, and R. S. Prasher, “Variations of acoustic and diffuse mismatch models in predicting thermal-boundary resistance,” J. Thermophys. Heat Transfer, 14, 144–150, 2000. H. J. Lee, Private communication, which provided a detailed derivation of the correct expression of the specularity p. A sequence of typos were found in Ziman’s book, Ref. [23] in Chap. 5, leading to the erroneous expression of p exp (16p3s2rms/l2). S. Mazumdar and A. 
Majumdar, “Monte Carlo study of phonon transport in solid thin films including dispersion and polarization,” J. Heat Transfer, 123, 749–759, 2001. C.-J. Twu and J.-R. Ho, “Molecular-dynamics study of energy flow and the Kapitza conductance across an interface with imperfection formed by two dielectric thin films,” Phys. Rev. B, 67, 205422, 2003. S. R. Phillpot, P. K. Schelling, and P. Keblinski, “Interfacial thermal conductivity: Insights from atomic level simulation,” J. Mater. Sci., 40, 3143–3148, 2005. Y. Chen, D. Li, J. R. Lukes, Z. Ni, and M. Chen, “Minimum superlattice thermal resistivity from molecular dynamics,” Phys. Rev. B, 72, 174302, 2005. H. Zhong and J. R. Lukes, “Interfacial thermal resistance between carbon nanotubes: Molecular dynamics simulations and analytical thermal modeling,” Phys. Rev. B, 74, 125403, 2006. J. D. Chung, A. J. H. McGaughey, and M. Kaviany, “Role of phonon dispersion in lattice thermal conductivity,” J. Heat Transfer, 126, 376–380, 2004. S. V. J. Narumanchi, J. Y. Murthy, and C. H. Amon, “Simulation of unsteady small heat source effects in sub-micron heat conduction,” J. Heat Transfer, 125, 896–903, 2003; S. V. J. Narumanchi, J. Y. Murthy, and C. H. Amon, “Submicron heat transport model in silicon accounting for phonon dispersion and polarization,” J. Heat Transfer, 126, 946–955, 2004. J. Lai and A. Majumdar, “Concurrent thermal and electrical modeling of sub-micrometer silicon devices,” J. Appl. Phys., 79, 7353–7361, 1996. P. G. Sverdrup, Y. S. Ju, and K. E. Goodson, “Sub-continuum simulation of heat conduction in silicon-on-insulator transistors,” J. Heat Transfer, 123, 130–137, 2001. S. Sinha, E. Pop, R. W. Dutton, and K. E. Goodson, “Non-equilibrium phonon distribution in sub100 nm silicon transistors,” J. Heat Transfer, 128, 638–647, 2006. M. R. Abel, T. L. Wright, W. P. King, and S. Graham, “Thermal metrology of silicon microstructures using Raman spectroscopy,” IEEE Trans. Comp. Pack. Technol., accepted 2007. J. Lee, T. 
Beechem, T. L. Wright, B. A. Nelson, S. Graham, and W. P. King, “Electrical, thermal, and, mechanical characterization of silicon microcantilever heaters,” J. Microelectromech. Syst., 15, 1644, 2007; J. Lee, T. L. Wright, M. R. Abel, et al., “Thermal conduction from microcantilever heaters in partial vacuum,” J. Appl. Phys., 101, 014906, 2007. K. Park, J. Lee, Z. M. Zhang, and W. P. King, “Frequency-dependent electrical and thermal response of heated atomic force microscope cantilevers,” J. Microelectromech. Syst., accepted 2007.

NONEQUILIBRIUM ENERGY TRANSFER IN NANOSTRUCTURES

281

62. K. Park, A. Marchenkov, Z. M. Zhang, and W.P. King, “Low temperature characterization of heated microcantilevers,” J. Appl. Phys., accepted 2007.

PROBLEMS

7.1. What is the characteristic length for heat conduction along a thin film? Why is local equilibrium a good assumption in this case, even though the film thickness is less than the mean free path of the heat carriers? Why does the thermal conductivity depend on the thickness of the film?

7.2. Why do we say that Fourier’s law is a fundamental physical law, like Newton’s laws in mechanics, but the Cattaneo equation is not? Comment on the paradox of infinite speed of heat diffusion by considering the feasibility of exciting the surface temperature or depositing a heat flux to the surface instantaneously.

7.3. Consider a 1-D semi-infinite medium initially at uniform temperature $T_i$. The surface temperature is suddenly changed to a constant temperature, $T(0,t) = T_s$. The analytical solution of the heat diffusion equation gives

$$\theta(x,t) = \frac{T(x,t) - T_i}{T_s - T_i} = \mathrm{erfc}\left(\frac{x}{2\sqrt{\alpha t}}\right)$$

For silicon at various temperatures, use the properties given in Example 5-6 to estimate how long it will take for a given location to gain a temperature rise that is $\theta = 10^{-12}$, or one part per trillion of the maximum temperature difference. Estimate the average thermal diffusion speed in terms of $x$ and $T_i$. [Hint: $\mathrm{erfc}(5.042) = 1.00 \times 10^{-12}$.]

7.4. Repeat Problem 7.3, using copper instead of silicon as the material, based on the properties given in Example 5-5. Discuss why the average thermal diffusion speed is different under different boundary conditions, i.e., constant heat flux and constant temperature. From an engineering point of view, do you think heat diffusion is a fast or slow process? Why?

7.5. (a) Derive Eq. (7.4), the hyperbolic heat equation, from the Cattaneo equation. (b) Derive Eq. (7.14), the lagging heat equation, based on the dual-phase-lag model.

7.6. Take GaAs as an example. How would you compare the speed of sound with the average thermal diffusion speed at different temperatures and length scales? This problem requires some literature search on the properties.

7.7. Assume the hyperbolic heat equation would work for transient heat transfer in glass (Pyrex) near room temperature. Given $\kappa = 1.4$ W/(m·K), $\rho = 2500$ kg/m³, $c_p = 835$ J/(kg·K), and $v_a = 5640$ m/s. (a) At what speed would the temperature wave propagate? (b) For an excimer laser with a pulse width $t_p = 10$ ns, 0.1 ns after the pulse starts, could the hyperbolic equation be approximated by the parabolic equation? (c) Suppose we have an instrument available to probe the timescale below $\tau_q$; will the hyperbolic heat equation be able to describe the observation?

7.8. Derive Eq. (7.13b) from Eq. (7.13a). Discuss the conditions for these equations to be reduced to Fourier’s law or the Cattaneo equation.

7.9. Show that Eq. (7.17) satisfies Eq. (7.16). Discuss the conditions for Eq. (7.17) to represent Fourier’s law or the Cattaneo equation.

7.10. Derive Eq. (7.18a), Eq. (7.18b), and Eq. (7.18c).

7.11. Derive Eq. (7.27a) and Eq. (7.27b). Calculate $\tau$, $\tau_q$, and $\tau_T$ of copper, for $T_e = 300$, 1000, and 5000 K, assuming the lattice temperature $T_s = 300$ K.

7.12. Calculate the electron-phonon coupling constant $G$ for aluminum, copper, gold, and silver near room temperature. Discuss the dependence of $\kappa$ and $G$ upon the electron and lattice temperatures $T_e$ and $T_s$.

7.13. At $T_e = 1000$, 3000, and 6000 K, estimate the energy transfer by thermionic emission from a copper surface, assuming that the electrons obey the equilibrium distribution function at $T_e$.

7.14. Based on Example 7-3, evaluate the heat flux in a thin silicon film. How thin must it be in order for it to be considered as in the radiative thin limit? Calculate the medium temperature $T$. Plot the left-hand


side and the right-hand side of Eq. (7.43). Furthermore, assuming Eq. (7.43) to be true for each frequency, find a frequency-dependent temperature $T(\omega)$ of the medium. At what frequency does $T(\omega) = T$? Is there any physical significance of $T(\omega)$?

7.15. Derive Eq. (7.53), using Eq. (7.38), Eq. (7.49), and Eq. (7.50).

7.16. In principle, one should be able to study nonequilibrium electrical and thermal conduction in the direction perpendicular to the plane, and use the BTE to determine the effective conductivities. This could be a team project, in which a few students work together to formulate the necessary equations. As an individual assignment, describe how to set up the boundary conditions, as well as the steps you plan to follow, without actually deriving the equations.

7.17. For a diamond type IIa film, $v_l = 17{,}500$ m/s, $v_t = 12{,}800$ m/s, and $\kappa = 3300$ W/(m·K) near 300 K. Assume that the boundaries can be modeled as blackbodies for phonons. For boundary temperatures $T_1 = 350$ K and $T_2 = 250$ K, calculate and plot the heat flux $q''_x$ and the effective thermal conductivity $\kappa_{\mathrm{eff}}$ across a film of thickness $L$, which varies from 0.05 to 50 μm.

7.18. Calculate the TBR between the high-temperature superconductor YBa₂Cu₃O₇₋δ and an MgO substrate, at an average temperature between 10 and 90 K, using both the AMM and the DMM without considering the electronic effect. The following parameters are given for YBa₂Cu₃O₇₋δ: $v_l = 4780$ m/s, $v_t = 3010$ m/s, $\rho = 6338$ kg/m³, and $\Theta_D = 450$ K; and for MgO: $v_l = 9710$ m/s, $v_t = 6050$ m/s, $\rho = 3576$ kg/m³, and $\Theta_D = 950$ K.

7.19. Evaluate the effective thermal conductivity near room temperature of a GaAs/AlAs superlattice, with a total thickness of 800 nm, using the DMM to compute the transmission coefficient. Assume the end surfaces are blackbodies to phonons; consider that (a) each layer is 4 nm thick and (b) each layer is 40 nm thick. The following parameters are given, considering phonon dispersion on thermal conductivity, for GaAs: $C = 880$ kJ/(m³·K), $v_g = 1024$ m/s, and $\Lambda = 145$ nm; and for AlAs: $C = 880$ kJ/(m³·K), $v_g = 1246$ m/s, and $\Lambda = 236$ nm. How does the result compare with that for a single layer of either GaAs or AlAs?

7.20. Evaluate the effective thermal conductivity near room temperature of a Si/Ge superlattice, with a total thickness of 1000 nm, using the DMM to compute the transmission coefficient. Assume the end surfaces are blackbodies to phonons; consider that (a) each layer is 5 nm thick and (b) each layer is 50 nm thick. The following parameters are given, considering phonon dispersion on thermal conductivity, for Si: $C = 930$ kJ/(m³·K), $v_g = 1804$ m/s, and $\Lambda = 260$ nm; and for Ge: $C = 870$ kJ/(m³·K), $v_g = 1042$ m/s, and $\Lambda = 199$ nm. How does the result compare with that for a single layer of either Si or Ge?
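As a starting point for Problems 7.3 and 7.4, the erfc solution can be inverted for the time at which a location x first reaches a prescribed θ. The following is a minimal sketch rather than a worked solution: the function names are hypothetical, and the diffusivity used in the demonstration is a round placeholder, not the property data of Example 5-6 or Example 5-5. Writing η = x/(2√(αt)), the condition erfc(η) = θ gives t = x²/(4αη²), so the average speed x/t = 4αη²/x.

```python
import math


def eta_for_theta(theta, lo=0.0, hi=10.0, tol=1e-12):
    """Solve erfc(eta) = theta by bisection; erfc decreases monotonically."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid) > theta:
            lo = mid          # erfc still too large, so the root lies at larger eta
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def arrival_time(x, alpha, theta=1e-12):
    """Time (s) at which theta(x, t) = erfc(x / (2 sqrt(alpha t))) reaches `theta`."""
    eta = eta_for_theta(theta)
    return x ** 2 / (4.0 * alpha * eta ** 2)


def average_speed(x, alpha, theta=1e-12):
    """Average diffusion speed x / t (m/s) for the same criterion."""
    return x / arrival_time(x, alpha, theta)


if __name__ == "__main__":
    # ASSUMPTION: placeholder diffusivity in m^2/s, for illustration only;
    # substitute the temperature-dependent values from the cited examples.
    alpha = 1.0e-4
    print("eta for theta = 1e-12:", round(eta_for_theta(1e-12), 3))
    for x in (1e-6, 1e-3, 1.0):  # distance from the surface, m
        t = arrival_time(x, alpha)
        print(f"x = {x:g} m: t = {t:.3e} s, x/t = {average_speed(x, alpha):.3e} m/s")
```

Because the average speed scales as 1/x, the sketch makes it easy to see how the apparent diffusion speed changes between the nanometer and meter scales for a fixed diffusivity.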

CHAPTER 8

FUNDAMENTALS OF THERMAL RADIATION

Radiation is one of the fundamental modes of heat transfer. However, the concepts of thermal radiation are more complicated than those of conduction and convection and, hence, more difficult to grasp. The main features of radiation that are distinct from conduction and convection are as follows: (a) Radiation can transfer energy with and without an intervening medium; (b) The radiant heat flux is not proportional to the temperature gradient; (c) Radiation emission is wavelength dependent, and the radiative properties of materials depend on the wavelength and the temperature; and (d) The radiant energy exchange and the radiative properties depend on the direction and orientation [1, 2].

The dual theory explains the nature of radiation as either electromagnetic waves or a collection of particles called photons. Although radiation can travel in vacuum, it originates from matter. All forms of matter emit radiation through complicated mechanisms (e.g., molecular vibration in gases, and electron and lattice vibrations in solids). In most solids and some liquids, radiation emitted from the interior is strongly absorbed by adjoining molecules. Therefore, radiation from or to these materials is often treated as a surface phenomenon, while radiation in gases and some semitransparent solids or liquids has to be treated as a volumetric phenomenon. Nevertheless, one must treat solids or liquids as a medium (i.e., volumetrically) to understand the mechanisms of reflection and emission, to predict the radiative properties of thin films and small particles, and to calculate radiation heat transfer between objects placed in close vicinity.

Thermal radiation refers to a type of radiation where the emission is directly related to the temperature of the body (or surface). There are numerous engineering applications where radiation heat transfer is important, such as furnaces, combustion, high-temperature materials processing and manufacturing, solar energy, space cooling and insulation, and cryogenic systems.
Even at room temperature, radiative heat transfer may be of the same order of magnitude as convective heat transfer.

The study of thermal radiation went along with the study of light phenomena and led to some major breakthroughs in modern physics. It is instructive to give a brief survey of major historical developments related to thermal radiation. Quantitative understanding of the nature of light began in the seventeenth century with the discoveries of Snell’s law of refraction, Fermat’s least-time principle of light path, Huygens’ principle of constructing the wavefront from secondary waves, and Newton’s prism that helped him prove white light consists of many different types of rays. At the dawn of the nineteenth century, Sir Frederick William Herschel (1738–1822), a German-born English astronomer, discovered infrared radiation [3]. His original objective was to find a suitable color for a glass filter, which could transmit the most light but the least amount of heat, for use in solar observations. By moving a thermometer along the spectrum of solar radiation that passed through a prism, Herschel accidentally found that the temperature of the thermometer would rise even though it was placed beyond the red end of the visible light. He published several papers in Philosophical Transactions of the Royal Society of London in


1800 and called the unknown radiation invisible light or heat-making rays. Young’s double-slit experiment in 1801 demonstrated the interference phenomenon and the wave nature of light. It was followed by intensive studies on polarization and reflection phenomena, led by French physicist Augustin-Jean Fresnel (1788–1827), who contributed significantly to the establishment of the wave theory of light. In 1803, radiation beyond the violet end of the visible spectrum was also discovered through its chemical effects. The ultraviolet, visible, and infrared spectra were thus associated with chemical, luminous, and heating effects, respectively. Yet, the common nature of the different types of radiation was not known until the late nineteenth century.

One of the obstacles to accurately measuring infrared radiation (or heat radiation, as it was called in those days) was the lack of sensitive detectors. In the earlier years, measurements were performed using thermometers with blackened bulbs. In 1829, Italian physicists Leopoldo Nobili (1784–1835) and Macedonio Melloni (1798–1854) invented the thermopile, which is made by connecting a number of thermocouples in series and is much more sensitive and faster than the thermometer. Melloni used the device to study the infrared radiation from hot objects and the sun.

Gustav Kirchhoff (1824–1887), a German physicist, contributed greatly to the fundamental understanding of spectroscopy and the thermal emission by heated objects. In 1862, he coined the term “black body” radiation and established Kirchhoff’s law, which states that the emissivity of a surface equals its absorptivity at thermal equilibrium.

Many famous physicists and mathematicians have contributed to electromagnetism. The complete equations of electromagnetic waves were established in 1873 by Scottish physicist James Clerk Maxwell (1831–1879), and later confirmed experimentally by German physicist Heinrich Hertz (1857–1894), who discovered radio waves due to electrical vibrations.
Before the existence of electrons had been proven, Dutch physicist Hendrik Lorentz (1853–1928) proposed that light waves were due to oscillations of an electric charge in the atom. He received the Nobel Prize in Physics in 1902 for his mathematical theory relating electron wave motion and light. The 1902 Nobel Prize was shared with his student Pieter Zeeman (1865–1943) for the experimental study of the effect of magnetic fields on atomic structures, which results in the splitting of spectral lines of the emitted light. The electromagnetic wave theory has played a central role in radio, radar, television, microwave technology, telecommunication, thermal radiation, and physical optics. Albert Einstein arrived at the famous formula $E = mc^2$ in 1905, after connecting the relativity principle with the Maxwell equations.

In 1881, Samuel Langley (1834–1906), the American astronomer, physicist, and aeronautics pioneer, invented a highly sensitive device called the bolometer for the detection of thermal radiation. The bolometer used two platinum strips, connected in a Wheatstone bridge circuit with a sensitive galvanometer, to read the imbalance of the bridge caused by the exposure of one of the strips to radiation. Langley was the first to make an accurate map of the solar spectrum up to a wavelength of 2.8 μm.

The Stefan-Boltzmann law of blackbody radiation is a result of the empirical relation obtained by Slovenian physicist Joseph Stefan (1835–1893) in 1879, based on experimental observations, and the theoretical proof given by Austrian physicist Ludwig Boltzmann (1844–1906) in 1884, based on thermodynamic relations of a Carnot cycle with radiation as the working fluid, using the concept of radiation pressure. German physicist Wilhelm Wien (1864–1928) derived the displacement law in 1893 by considering a piston moving within a mirrored empty cylinder filled with thermal radiation.
Wien also derived a spectral distribution of blackbody radiation, called Wien’s formula, which is applicable to the short-wavelength region of the blackbody spectrum but deviates toward long wavelengths. Wien received the Nobel Prize in 1911 “for his discoveries regarding the laws governing the radiation of heat.” In 1900, Lord Rayleigh (1842–1919), British physicist and Nobel Laureate in Physics in 1904, used the equipartition theorem to show that the blackbody emission should be directly proportional to temperature but inversely proportional to the fourth power of wavelength. Sir James Jeans (1877–1946), British physicist, astronomer, and mathematician,


derived a more complete expression in 1905. The Rayleigh-Jeans formula agreed with experiments at sufficiently high temperatures and long wavelengths, where Wien’s formula failed, but disagreed with experiments at short wavelengths. It is noteworthy that Rayleigh made great contributions to light scattering and wave phenomena, such as the discovery of Rayleigh scattering by small objects, which explains why the sky is blue and why the sunset glows orange. Rayleigh also predicted the existence of surface waves, sometimes called Rayleigh waves, which propagate along the interface between two different media. The amplitude of the wave, however, decays in each medium as the distance from the interface increases.

In an effort to obtain better agreement with measurements at long wavelengths, German physicist Max Planck (1858–1947) in 1900 used the maximum entropy principle, based on Boltzmann’s entropy expression, to derive an equation, known as Planck’s law, which agrees with experiments over the whole spectral region. Planck obtained his expression independently of Rayleigh’s work published several months earlier, while the complete derivation of the Rayleigh-Jeans formula was obtained several years later. In his book The Theory of Heat Radiation, Planck showed that his formula would reduce to Wien’s formula at small $\lambda T$ and to the Rayleigh-Jeans formula at very large $\lambda T$ [4]. In his derivation, Planck used a bold assumption that contradicted classical electrodynamics. His hypothesis was that energy is not infinitely divisible but must assume discrete values, which are proportional to the frequency. This concept would have been easily accepted for a system consisting of particles, like atoms or gas molecules, but not for oscillators that radiate electromagnetic energy. Planck’s work opened the door to quantum mechanics. The idea of quantization of radiation was further developed by Einstein, who applied it to explain the photoelectric effect in 1905.
Planck was awarded the Nobel Prize in Physics in 1918 for the discovery of energy quanta. In 1924, Indian mathematical physicist Satyendra Nath Bose (1894–1974) modified the Boltzmann statistics of ideal molecular gases by treating photons as indistinguishable particles, in order to derive Planck’s distribution function. With the help of Einstein, Bose’s work was published in Zeitschrift für Physik in 1924. Einstein further extended Bose’s theory to atoms and predicted the existence of a phenomenon known as Bose-Einstein condensation, as discussed in Chap. 3. It is clear that the quest to understand thermal radiation has led to important discoveries in modern physics.

This chapter contains an introduction to the electromagnetic wave theory, blackbody radiation, plane wave reflection and refraction at the boundary between two semi-infinite media, and various models used to study the optical properties of different materials. The material covered in the following sections is intended to provide a detailed background for more in-depth discussion of the applications to micro/nanosystems in subsequent chapters.
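As a numerical aside to the historical survey, the statement that Planck’s formula reduces to Wien’s formula at small λT and to the Rayleigh-Jeans formula at large λT can be checked with a few lines of code. The sketch below assumes the standard textbook form of Planck’s law for the blackbody spectral emissive power, C1/{λ⁵[exp(C2/λT) − 1]}, together with the corresponding Wien and Rayleigh-Jeans expressions. These formulas, the constant names C1 and C2, and the function names are assumptions introduced here for illustration, since this introduction does not state them.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C0 = 299792458.0     # speed of light in vacuum, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

C1 = 2.0 * math.pi * H * C0 ** 2   # first radiation constant, W m^2
C2 = H * C0 / KB                   # second radiation constant, m K


def planck(lam, T):
    """Blackbody spectral emissive power (W/m^3) at wavelength lam (m), temperature T (K)."""
    return C1 / (lam ** 5 * math.expm1(C2 / (lam * T)))


def wien(lam, T):
    """Wien's formula: Planck's law with exp(x) - 1 replaced by exp(x)."""
    return C1 / lam ** 5 * math.exp(-C2 / (lam * T))


def rayleigh_jeans(lam, T):
    """Rayleigh-Jeans formula: proportional to T and to 1 / lam^4."""
    return C1 * T / (C2 * lam ** 4)


if __name__ == "__main__":
    T = 1000.0  # K, arbitrary illustrative temperature
    for lam_T in (5e-4, 3e-3, 1e-1, 1.0):  # lambda * T in m K
        lam = lam_T / T
        p = planck(lam, T)
        print(f"lambda*T = {lam_T:g} m K: "
              f"Wien/Planck = {wien(lam, T) / p:.4f}, "
              f"RJ/Planck = {rayleigh_jeans(lam, T) / p:.4g}")
```

The ratio Wien/Planck approaches unity as λT decreases, while the ratio Rayleigh-Jeans/Planck approaches unity as λT increases and grows without bound at short wavelengths.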

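8.1 ELECTROMAGNETIC WAVES

8.1.1 Maxwell’s Equations

The propagation of electromagnetic waves in any medium is governed by a set of equations, first stated together by Maxwell. The macroscopic Maxwell equations can be written in the differential forms as follows [5–7]:

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{8.1}$$

$$\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \tag{8.2}$$

$$\nabla \cdot \mathbf{D} = \rho_e \tag{8.3}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{8.4}$$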


Based on the SI units, $\mathbf{E}$ in V/m is the electric field, $\mathbf{H}$ in A/m is the magnetic field, $\mathbf{J}$ in A/m² is the electric current density (i.e., electric charge flux), $\mathbf{D}$ in C/m² is the electric displacement, $\mathbf{B}$ in Wb/m² is the magnetic flux density (also called magnetic induction), and $\rho_e$ in C/m³ is the charge density. Note that in magnetism, 1 tesla (T) = 1 Wb/m², and 1 weber (Wb) = 1 V·s. The charge conservation or continuity equation, $\nabla \cdot \mathbf{J} + \partial \rho_e/\partial t = 0$, is implicitly included in the Maxwell equations, because it can be obtained by taking the divergence of Eq. (8.2) and then applying Eq. (8.3). The constitutive relations for a linear isotropic medium are

$$\mathbf{D} = \varepsilon_m \mathbf{E} \tag{8.5}$$

$$\mathbf{B} = \mu_m \mathbf{H} \tag{8.6}$$

where $\varepsilon_m$ in F/m is the electric permittivity and $\mu_m$ in N/A² is the magnetic permeability of the medium. Note that farad (F) is the SI unit of capacitance: 1 F = 1 C/V. The permittivity and permeability values of free space (vacuum) are $\varepsilon_0 = 8.854 \times 10^{-12}$ F/m and $\mu_0 = 4\pi \times 10^{-7}$ N/A², respectively. For anisotropic media, $\varepsilon_m$ and $\mu_m$ are dyadic tensors. The microscopic form of Ohm’s law gives

$$\mathbf{J} = \sigma \mathbf{E} \tag{8.7}$$

where $\sigma$ in A/(V·m) is the electric conductivity.

A brief discussion on the physical interpretation of Maxwell’s equations is given next. Equation (8.1) is an expression of Faraday’s law of induction, which states that a time-varying magnetic field produces an electric field in a coil. In other words, through any closed electric field line, there is a time-varying magnetic field. Combining Eq. (8.1) with Green’s theorem, Eq. (B.71), we see that the integral of the electric field around a closed loop is equal to the negative of the integral of the time derivative of the magnetic induction over the area enclosed by the loop. Equation (8.2) is the general Ampere law, which includes Maxwell’s displacement current ($\partial \mathbf{D}/\partial t$). It states that through any closed magnetic field line, there is an electric current density $\mathbf{J}$ or a displacement current or both. Conversely, circulating magnetic fields are produced by passing an electrical current through a conductor or changing electric fields or both. Equation (8.3) is Gauss’s law, which implies that the electric field diverges from electric charges. Using Gauss’s theorem, Eq. (B.70), it can be seen from Eq. (8.3) that the integral of the electric field over a closed surface is proportional to the electric charges enclosed by that surface. If there are no electric charges inside a closed surface, there is no net electric field penetrating the surface. Equation (8.4) is the analog of Gauss’s law for the magnetic field. However, since there exist no isolated magnetic poles, called magnetic monopoles, the integration of the magnetic field over any closed surface is zero.

The interpretations given in the preceding paragraph are straightforward since all variables and coefficients are considered as real quantities. However, Maxwell’s equations are most useful when all quantities are expressed in complex variables. The material properties, such as $\varepsilon_m$ and $\mu_m$, are generally complex and frequency dependent.
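The integral statements described in words above, and the continuity equation quoted after Eq. (8.4), can be written out explicitly. The following is a sketch based on the standard Stokes and divergence theorems; the symbols $C$ (a closed loop), $S$ (an open surface bounded by $C$), $V$ (a volume with closed boundary $\partial V$), $d\mathbf{l}$, and $d\mathbf{A}$ are notation assumed here for illustration and are not taken from Appendix B.

```latex
% Faraday's law, Eq. (8.1), integrated over an open surface S bounded by the loop C
\oint_C \mathbf{E}\cdot d\mathbf{l}
  = -\int_S \frac{\partial \mathbf{B}}{\partial t}\cdot d\mathbf{A}

% Gauss's law, Eq. (8.3), and its magnetic counterpart, Eq. (8.4),
% integrated over a volume V with closed boundary \partial V
\oint_{\partial V} \mathbf{D}\cdot d\mathbf{A} = \int_V \rho_e \, dV ,
\qquad
\oint_{\partial V} \mathbf{B}\cdot d\mathbf{A} = 0

% Continuity: take the divergence of Eq. (8.2), use \nabla\cdot(\nabla\times\mathbf{H}) = 0,
% and then substitute Eq. (8.3)
0 = \nabla\cdot(\nabla\times\mathbf{H})
  = \nabla\cdot\mathbf{J} + \frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{D}\right)
  = \nabla\cdot\mathbf{J} + \frac{\partial \rho_e}{\partial t}
```

With $\mathbf{D} = \varepsilon_m \mathbf{E}$ and a uniform $\varepsilon_m$, the second relation shows that the flux of $\mathbf{E}$ through a closed surface is proportional to the enclosed charge, as stated in the text.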
To facilitate the understanding, we will start with simple cases first and then generalize the theory for more realistic problems.

8.1.2 The Wave Equation

Sometimes called free charge density, $\rho_e$ in Eq. (8.3) should be treated as excess charges or net charges per unit volume. Because the number of electrons equals the number of protons in the nuclei, in most media, we can assume $\rho_e = 0$. For a nonconductive material, $\sigma = 0$. We further assume that $\varepsilon_m$ and $\mu_m$ are both real and independent of position, time, and the field strength. This is true for a nondissipative (lossless), homogeneous, and linear material. If $\mu_m = \mu_0$, the material is said to be nonmagnetic. Therefore, a nonconductive and


nonmagnetic material is a dielectric for which only $\varepsilon_m$ is needed to characterize its electromagnetic behavior. Materials with both $\varepsilon_m$ and $\mu_m$ being real but $\mu_m \neq \mu_0$ are sometimes called general dielectrics or dielectric-magnetic media. Substituting the constitutive relations into Maxwell’s equations and then combining Eq. (8.1) and Eq. (8.2), we obtain

$$\nabla^2 \mathbf{E} = \mu_m \varepsilon_m \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{8.8}$$

where the vector identity given in Eq. (B.64), $\nabla \times (\nabla \times \mathbf{E}) = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\nabla^2 \mathbf{E}$, has been employed. Equation (8.8) is the wave equation, which can also be written in terms of the magnetic field. The wave equation has an infinite number of solutions (see Problem 8.1). The solution of Eq. (8.8) for a monochromatic plane wave can be written as

$$\mathbf{E} = \mathbf{E}_0 e^{-i(\omega t - \mathbf{k} \cdot \mathbf{r})} \tag{8.9}$$

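where $\mathbf{E}_0$ is the amplitude vector, $\omega$ is the angular frequency, $\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$ is the position vector, and $\mathbf{k} = k_x\hat{\mathbf{x}} + k_y\hat{\mathbf{y}} + k_z\hat{\mathbf{z}}$ is the wavevector, which points toward the direction of propagation. In order for Eq. (8.9) to be a solution of Eq. (8.8), the magnitude of $\mathbf{k}$ must be $k = \omega\sqrt{\mu_m\varepsilon_m}$. The complex form of the electric field is used in Eq. (8.9) to facilitate mathematical manipulation. The actual electric field may be expressed as the real part of Eq. (8.9), viz.,

$$\mathrm{Re}(\mathbf{E}) = \mathrm{Re}(\mathbf{E}_0)\cos\phi + \mathrm{Im}(\mathbf{E}_0)\sin\phi \tag{8.10}$$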

where Re or Im stands for taking the real part or the imaginary part, and $\phi = \omega t - \mathbf{k}\cdot\mathbf{r}$ is the phase. Equation (8.9) is a time-harmonic solution at a fixed frequency. Because any time-space-dependent function can be expressed as a Fourier series of many frequency components, we can integrate Eq. (8.9) over all frequencies to obtain the total electric field at any time and position. Therefore, understanding the nature of Eq. (8.9) is very important to the study of electromagnetic wave phenomena. When Eq. (8.9) is substituted into Maxwell’s equations, a time derivative $\partial/\partial t$ can be replaced by a multiplication of $-i\omega$ and the operator $\nabla$ can be replaced by $i\mathbf{k}$. Hence, the first two Maxwell equations can be written as

$$\mathbf{k}\times\mathbf{E} = \omega\mu_m\mathbf{H} \tag{8.11a}$$

and

$$\mathbf{k}\times\mathbf{H} = -\omega\varepsilon_m\mathbf{E} \tag{8.11b}$$

The two equations suggest that $\mathbf{E}$, $\mathbf{H}$, and $\mathbf{k}$ are orthogonal and form a right-handed triplet, when both $\varepsilon_m$ and $\mu_m$ are positive. On the surface normal to the wavevector $\mathbf{k}$, the electric or magnetic field is a function of time only, because $\mathbf{k}\cdot\mathbf{r} = \text{const}$. This surface is called a wavefront. In the $\mathbf{k}$ direction, the wavefront travels at the speed given by

$$c = \frac{\omega}{k} = \frac{1}{\sqrt{\mu_m\varepsilon_m}} \tag{8.12}$$

which is called phase speed, and it is the smallest speed at which the phase of the wave propagates.⁸ The phase velocity is the phase speed times the unit wavevector. Figure 8.1 illustrates a plane wave, propagating in the positive x direction, whose electric field is parallel to the y direction and magnetic field parallel to the z direction. In such cases, $\mathbf{k} = k\hat{\mathbf{x}}$ and $\mathbf{k}\cdot\mathbf{r} = kx$. The wavefront is perpendicular to the x direction. It can be seen clearly that the wavevector is related to the wavelength $\lambda_m$ in the medium by $k = 2\pi/\lambda_m$. In free space, the speed of electromagnetic wave is given by $c_0 = 1/\sqrt{\mu_0\varepsilon_0}$. The speed of light in vacuum was instated as an exact number, $c_0 = 299{,}792{,}458$ m/s, by the General Conference on Weights and Measures (abbreviated as CGPM for Conférence Générale des

FIGURE 8.1 Illustration of a linearly polarized electromagnetic wave.

Poids et Mesures) in 1983. The SI base unit meter has since been defined as the distance that light travels in vacuum during a time interval of 1/299,792,458 s. The NIST reference on constants, units, and uncertainty can be found on the web page http://physics.nist.gov/cuu/index.html, which contains detailed discussions about the fundamental physical constants and the base SI units. For most calculations, it suffices to use $c_0 \approx 2.998\times10^8$ m/s. The refractive index of the medium is given as $n = \sqrt{\mu_m\varepsilon_m/(\mu_0\varepsilon_0)} = c_0/c$. Therefore, $c = c_0/n$ and $\lambda_m = \lambda/n$, where $\lambda$ is the wavelength in vacuum. For nonmagnetic materials $\mu_m/\mu_0 = 1$; thus, $n = \sqrt{\varepsilon_m/\varepsilon_0}$. Notice that $n$ of a medium is a function of frequency (or wavelength) and is in general temperature dependent. For polychromatic light, the phase speed usually depends on wavelength because $n = n(\lambda)$ in a dispersive medium. In vacuum, the energy propagation velocity is the same as the phase velocity. For polychromatic waves in a dispersive medium, the group velocity $\mathbf{v}_g$ determines the direction and speed of energy flow and is defined as

$$\mathbf{v}_g = \nabla_{\mathbf{k}}\,\omega = \frac{d\omega}{d\mathbf{k}} = \frac{\partial\omega}{\partial k_x}\hat{\mathbf{x}} + \frac{\partial\omega}{\partial k_y}\hat{\mathbf{y}} + \frac{\partial\omega}{\partial k_z}\hat{\mathbf{z}} \tag{8.13}$$

which is the gradient of $\omega$ in the k-space. In a homogeneous and isotropic medium, $v_g = c_0/(n + \omega\,dn/d\omega)$ and the direction of the group velocity will be the same as that of the wavevector $\mathbf{k}$. In a nondispersive medium, where $n$ is not a function of frequency, it is clear that $v_g = c = c_0/n$. When light is refracted from a nondispersive medium to a dispersive medium, the group velocity can have a component parallel to the group fronts, and hence, the energy flow is not necessarily perpendicular to the group fronts.⁸ Notice that the wave equation is also applicable to other types of waves such as acoustic waves, which are matter waves with a longitudinal and two transverse modes, as mentioned in Chap. 5.

8.1.3 Polarization

A simple transverse wave will oscillate perpendicular to the wavevector. Because electromagnetic waves have two field vectors that can change their directions during propagation, the polarization behavior may be complicated. It is important to understand the nature of polarization in order to fully characterize an electromagnetic wave. There are two equivalent ways to interpret a complex vector $\mathbf{A}$. The first method considers it as a vector whose components are complex, i.e.,

$$\mathbf{A} = A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}} + A_z\hat{\mathbf{z}} \tag{8.14}$$

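where $A_x$, $A_y$, and $A_z$ are complex numbers:

$$A_x = A_x' + iA_x'', \quad A_y = A_y' + iA_y'', \quad\text{and}\quad A_z = A_z' + iA_z'' \tag{8.14a}$$

The second method decomposes it into two real vectors such that

$$\mathbf{A} = \mathbf{A}' + i\mathbf{A}'' \tag{8.15}$$

where $\mathbf{A}'$ and $\mathbf{A}''$ are the real and imaginary parts of the complex vector, given by

$$\mathbf{A}' = A_x'\hat{\mathbf{x}} + A_y'\hat{\mathbf{y}} + A_z'\hat{\mathbf{z}} \quad\text{and}\quad \mathbf{A}'' = A_x''\hat{\mathbf{x}} + A_y''\hat{\mathbf{y}} + A_z''\hat{\mathbf{z}} \tag{8.15a}$$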

In either case, a complex vector has six real scalar terms. For the time being, let us assume all the material properties to have real values and $\mathbf{k}$ to be a real vector. Both $\mathbf{E}$ and $\mathbf{H}$ are complex, according to Eq. (8.9). To ensure that $\mathbf{k}\cdot\mathbf{E} = 0$ at any time and location, both $\mathrm{Re}(\mathbf{E}_0)$ and $\mathrm{Im}(\mathbf{E}_0)$ must be perpendicular to $\mathbf{k}$. The same is true for the magnetic vector. Because $\mathbf{H}$ can be obtained from Eq. (8.11a), the state of polarization can be based on how the electric field varies in time and along the $\mathbf{k}$ direction in space. In order to study the time dependence of the electric field, rewrite Eq. (8.10) as

$$\mathrm{Re}(\mathbf{E}) = \mathbf{a}\cos(\omega t) + \mathbf{b}\sin(\omega t) \tag{8.16}$$

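where $\mathbf{a} = \mathrm{Re}(\mathbf{E}_0 e^{i\mathbf{k}\cdot\mathbf{r}})$ and $\mathbf{b} = \mathrm{Im}(\mathbf{E}_0 e^{i\mathbf{k}\cdot\mathbf{r}})$ are both real vectors and perpendicular to $\mathbf{k}$. In general, the electric field will vary with time in an ellipse, called the vibration ellipse, as shown in Fig. 8.2. If $\mathbf{a}$ and $\mathbf{b}$ are parallel or, equivalently, $\mathrm{Re}(\mathbf{E}_0)$ and $\mathrm{Im}(\mathbf{E}_0)$ are parallel to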

FIGURE 8.2 Illustration of polarization by the vibration ellipse, for a plane wave propagating in the positive z direction (out of the paper). The electric field vector is plotted at an increment of $\omega t = \pi/12$.

each other, then the electric field will not change its direction. The wave is said to be linearly polarized, and either $\mathbf{a}$ or $\mathbf{b}$ specifies the direction of polarization. An example of a linearly polarized wave is the wave shown in Fig. 8.1. When $\mathbf{a}\perp\mathbf{b}$ and $|\mathbf{a}| = |\mathbf{b}|$, the vibration ellipse is a circle and the wave is said to be circularly polarized. In general, a monochromatic wave

described by Eq. (8.10) is elliptically polarized. For circularly or elliptically polarized light, if $\mathbf{a}\times\mathbf{b}$ is in the same direction as $\mathbf{k}$, the vibration ellipse will rotate counterclockwise (left-handed), as viewed toward the light source; and if $\mathbf{a}\times\mathbf{b}$ is opposite to the direction of propagation, the vibration ellipse will rotate clockwise (right-handed).⁶,⁷ Similarly, one can consider the polarization of the electric field at a fixed time, and observe the vibration ellipse along the direction of propagation as an exercise (see Problem 8.2). Because of the random nature of thermal radiation, the Fourier component does not vary with time exactly following $e^{-i\omega t}$ but with some fluctuations in the amplitude. The polarization may become completely random, which is said to be unpolarized, randomly polarized, or completely uncorrelated. In any case, the electric field can be decomposed into the two orthogonal directions on the vibration ellipse. This is particularly useful for calculating energy transfer. A complete description of polarization is based on Stokes parameters, which are important in the study of light scattering and will be discussed in Chap. 9.
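The classification rules of this section can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function name `classify_polarization`, the tolerance `tol`, and the choice to evaluate the amplitude at $\mathbf{r} = 0$ (so that $\mathbf{a} = \mathrm{Re}(\mathbf{E}_0)$ and $\mathbf{b} = \mathrm{Im}(\mathbf{E}_0)$) are assumptions made for the example.

```python
import numpy as np

def classify_polarization(E0, k_hat, tol=1e-9):
    """Classify a monochromatic plane wave from its complex amplitude E0.

    Assumes r = 0, so a = Re(E0) and b = Im(E0) as in Eq. (8.16), and
    the exp(-i*omega*t) time convention of Eq. (8.9).
    """
    a = np.real(np.asarray(E0, dtype=complex))
    b = np.imag(np.asarray(E0, dtype=complex))
    cross = np.cross(a, b)
    # a parallel to b (or one of them zero): the field direction never changes
    if np.linalg.norm(cross) < tol:
        return "linear"
    perpendicular = abs(np.dot(a, b)) < tol
    equal_length = abs(np.linalg.norm(a) - np.linalg.norm(b)) < tol
    shape = "circular" if (perpendicular and equal_length) else "elliptical"
    # a x b along k: counterclockwise (left-handed) viewed toward the source
    handed = "left-handed" if np.dot(cross, k_hat) > 0 else "right-handed"
    return f"{handed} {shape}"

k_hat = np.array([0.0, 0.0, 1.0])
print(classify_polarization([1, 1, 0], k_hat))
print(classify_polarization([1, 1j, 0], k_hat))
print(classify_polarization([1, -0.5j, 0], k_hat))
```

The handedness labels follow the convention stated in the text (viewed toward the light source); other references may use the opposite naming.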

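8.1.4 Energy Flux and Density

The energy conservation for electromagnetic field can be obtained from Maxwell’s equations, according to English physicist John Poynting (1852–1914). To derive Poynting’s theorem, one can dot multiply Eq. (8.1) and Eq. (8.2) by $\mathbf{H}$ and $\mathbf{E}$, respectively, and then add up each side. Using the vector identity in Eq. (B.63), we have $\nabla\cdot(\mathbf{E}\times\mathbf{H}) = (\nabla\times\mathbf{E})\cdot\mathbf{H} - (\nabla\times\mathbf{H})\cdot\mathbf{E}$. After simplifications, we obtain

$$-\nabla\cdot(\mathbf{E}\times\mathbf{H}) = \frac{\partial}{\partial t}\left(\frac{1}{2}\varepsilon_m\mathbf{E}\cdot\mathbf{E} + \frac{1}{2}\mu_m\mathbf{H}\cdot\mathbf{H}\right) + \mathbf{E}\cdot\mathbf{J} \tag{8.17}$$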

The left-hand term represents the energy flow into a differential control volume, the first term on the right is the rate of change of the stored energy (associated with the electric and magnetic fields), and the last term is the dissipated electromagnetic work or Joule heating. The Poynting vector is defined as

$$\mathbf{S} = \mathbf{E}\times\mathbf{H} \tag{8.18a}$$

The Poynting vector is essentially the energy flux, which gives both the direction and the rate of energy flow per unit projected surface area. Equation (8.17) and Eq. (8.18a) can be easily extended to the complex field notation. Although it is easy to write the Poynting vector (which is always real) as $\mathbf{S} = \mathrm{Re}(\mathbf{E})\times\mathrm{Re}(\mathbf{H})$, it is not very helpful because one would have to evaluate the real parts of $\mathbf{E}$ and $\mathbf{H}$ individually. Besides, the frequency of oscillation is usually too high to be measured. For harmonic fields, the time-averaged Poynting vector can be expressed as

$$\langle\mathbf{S}\rangle = \frac{1}{2}\mathrm{Re}(\mathbf{E}\times\mathbf{H}^*) \tag{8.18b}$$

where * signifies the complex conjugate. Similarly, the time-averaged energy density for time-harmonic fields can be expressed as⁵

$$\langle u\rangle = \frac{1}{4}\varepsilon_m\mathbf{E}\cdot\mathbf{E}^* + \frac{1}{4}\mu_m\mathbf{H}\cdot\mathbf{H}^* \tag{8.19}$$

For an absorbing or dissipative medium, a more complete description of the energy density can be found in Cui and Kong (Phys. Rev. B, 70, 205106, 2004).

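Example 8-1. Prove that Eq. (8.18b) is the time-averaged Poynting vector for time-harmonic fields.

Solution. Let $\mathbf{E} = \mathbf{E}(\mathbf{r})e^{-i\omega t}$ and $\mathbf{H} = \mathbf{H}(\mathbf{r})e^{-i\omega t}$, where $\mathbf{E}(\mathbf{r})$ and $\mathbf{H}(\mathbf{r})$ are complex vectors. Integrating the Poynting vector over a period $T$, we have

$$\langle\mathbf{S}\rangle = \frac{1}{T}\int_T \mathrm{Re}(\mathbf{E})\times\mathrm{Re}(\mathbf{H})\,dt = \frac{1}{4T}\int_T\left[\mathbf{E}(\mathbf{r})e^{-i\omega t} + \mathbf{E}^*(\mathbf{r})e^{i\omega t}\right]\times\left[\mathbf{H}(\mathbf{r})e^{-i\omega t} + \mathbf{H}^*(\mathbf{r})e^{i\omega t}\right]dt = \frac{1}{4}\left(\mathbf{E}\times\mathbf{H}^* + \mathbf{E}^*\times\mathbf{H}\right) = \frac{1}{2}\mathrm{Re}(\mathbf{E}\times\mathbf{H}^*)$$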
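The result of Example 8-1 can also be checked numerically. The following is a minimal sketch, assuming arbitrary test amplitudes and the $e^{-i\omega t}$ convention of Eq. (8.9); the function names and the sampling count `n_steps` are illustrative choices, not from the text. It averages $\mathrm{Re}(\mathbf{E})\times\mathrm{Re}(\mathbf{H})$ over one period and compares the result with $\frac{1}{2}\mathrm{Re}(\mathbf{E}\times\mathbf{H}^*)$.

```python
import numpy as np

def poynting_average_numeric(E, H, n_steps=2000):
    """Average Re(E e^{-i wt}) x Re(H e^{-i wt}) over one period of wt."""
    E = np.asarray(E, dtype=complex)
    H = np.asarray(H, dtype=complex)
    phases = np.linspace(0.0, 2.0 * np.pi, n_steps, endpoint=False)
    total = np.zeros(3)
    for phi in phases:
        e_real = np.real(E * np.exp(-1j * phi))
        h_real = np.real(H * np.exp(-1j * phi))
        total += np.cross(e_real, h_real)
    return total / n_steps

def poynting_average_complex(E, H):
    """Time-averaged Poynting vector from Eq. (8.18b)."""
    E = np.asarray(E, dtype=complex)
    H = np.asarray(H, dtype=complex)
    return 0.5 * np.real(np.cross(E, np.conj(H)))

# Arbitrary complex amplitudes, chosen only to exercise the identity
E = [1.0 + 0.5j, -0.3j, 0.2]
H = [0.1, 0.7 - 0.2j, -0.4 + 0.9j]
print(poynting_average_numeric(E, H))
print(poynting_average_complex(E, H))
```

The two printed vectors should agree to rounding error, since the terms oscillating at $2\omega$ average to zero over a full period.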

8.1.5 Dielectric Function

The conductivity is large at low frequencies for metals, due to free electrons. Even for good conductors, however, the electrons are not completely free but will be scattered by defects and phonons. At high frequencies, the current density $\mathbf{J}$ and the electric field $\mathbf{E}$ are not in phase anymore, suggesting that the conductivity should be a complex number. For insulators such as crystalline or amorphous dielectrics, electromagnetic waves can interact with bound electrons or lattice vibrations to transfer energy to the medium. At optical frequencies, the distinction between a conductor and an insulator becomes ambiguous unless the optical response over a large frequency region is considered. For example, a dielectric material can be highly reflective at a certain frequency region in the mid-infrared. On the other hand, a good conductor will be highly reflective in a much broader wavelength region from the near-infrared to the microwave. Let us first take the conductivity and the permittivity to be real, for a nonmagnetic material. The wave equation for $\sigma \neq 0$ and $\mu_m = \mu_0$ has the following form:

$$\nabla^2\mathbf{E} = \mu_0\sigma\frac{\partial\mathbf{E}}{\partial t} + \mu_0\varepsilon_m\frac{\partial^2\mathbf{E}}{\partial t^2} \tag{8.20}$$

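Suppose Eq. (8.9) is a solution of this equation. We can substitute $\partial\mathbf{E}/\partial t = -i\omega\mathbf{E}$, $\partial^2\mathbf{E}/\partial t^2 = -\omega^2\mathbf{E}$, and $\nabla^2\mathbf{E} = -k^2\mathbf{E}$ into Eq. (8.20) to obtain

$$k^2 = i\omega\mu_0\sigma + \omega^2\mu_0\varepsilon_m \tag{8.21}$$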

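Therefore, the wavevector becomes complex: $\mathbf{k} = \mathbf{k}' + i\mathbf{k}''$, where $\mathbf{k}' = k_x'\hat{\mathbf{x}} + k_y'\hat{\mathbf{y}} + k_z'\hat{\mathbf{z}}$ and $\mathbf{k}'' = k_x''\hat{\mathbf{x}} + k_y''\hat{\mathbf{y}} + k_z''\hat{\mathbf{z}}$ are real vectors. Note that Eq. (8.21) tells us the value of $k^2 = \mathbf{k}\cdot\mathbf{k} = k_x^2 + k_y^2 + k_z^2$, where each wavevector component may be complex, but does not specify the individual components. The complex dielectric function is defined as

$$\varepsilon = \varepsilon' + i\varepsilon'' = \frac{\varepsilon_m}{\varepsilon_0} + i\frac{\sigma}{\omega\varepsilon_0} \tag{8.22}$$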

For a nonmagnetic material, the complex refractive index $\tilde{n} = n + i\kappa$ is related to the complex dielectric function by $\varepsilon = (n + i\kappa)^2$. The imaginary part $\kappa$ of the complex refractive index is called the extinction coefficient. By definition, we have

$$\varepsilon' = n^2 - \kappa^2 \quad\text{and}\quad \varepsilon'' = 2n\kappa \tag{8.23}$$

The refractive index $n$ and the extinction coefficient $\kappa$ are also called optical constants,⁹ although neither of them is constant over a large wavelength region for real materials. The

dielectric function is also called relative permittivity, with respect to the permittivity of vacuum $\varepsilon_0$. One can consider the $\sigma/\omega$ term in Eq. (8.22) as the imaginary part of the permittivity. Some texts use $\varepsilon = \varepsilon' - i\varepsilon''$ for the dielectric function and $\tilde{n} = n - i\kappa$ for the complex refractive index. In doing so, Eq. (8.9) must be revised to $\mathbf{E} = \mathbf{E}_0 e^{i(\omega t - \mathbf{k}\cdot\mathbf{r})}$. In either convention, $\varepsilon''$ and $\sigma$ must be nonnegative for a passive medium. Equation (8.21) can be rewritten as

$$k = \tilde{n}\omega/c_0 \tag{8.24}$$

For simplicity, we will remove the tilde and simply use $n$ for the complex refractive index, where it can be clearly understood from the context. By substituting $i\mathbf{k}$ for $\nabla$ and $-i\omega$ for $\partial/\partial t$, we can rewrite Maxwell’s curl equations as

$$\mathbf{k}\times\mathbf{E} = \omega\mu_0\mathbf{H} \tag{8.25}$$

and

$$\mathbf{k}\times\mathbf{H} = -\omega\varepsilon_0\varepsilon\mathbf{E} \tag{8.26}$$

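Similar to the definition of the complex dielectric function, one may choose to define a complex conductivity that satisfies Ohm’s law at high frequencies, $\mathbf{J} = \tilde{\sigma}\mathbf{E}$, where

$$\tilde{\sigma} = \sigma' + i\sigma'' = \sigma - i\omega\varepsilon_m \tag{8.27}$$

because we have assumed that $\sigma$ is the real part of $\tilde{\sigma}$. Therefore,

$$\sigma'' = -\omega\varepsilon_0\varepsilon' \quad\text{and}\quad \varepsilon'' = \sigma'/(\omega\varepsilon_0) \tag{8.28}$$

Equation (8.26) can be recast in terms of the complex conductivity as

$$\mathbf{k}\times\mathbf{H} = -i\tilde{\sigma}\mathbf{E} \tag{8.29}$$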

In the subsequent discussion, we will omit the tilde above $\sigma$, when the context is sufficiently clear. The complex conductivity and the complex dielectric function are related to each other. For a linear, isotropic, and homogeneous nonmagnetic material, only two frequency-dependent functions are needed to fully characterize the electromagnetic response. The function pairs often found in the literature are $(n, \kappa)$, $(\sigma', \varepsilon')$, $(\varepsilon', \varepsilon'')$, and $(\sigma', \sigma'')$. The principle of causality, which states that the effect cannot precede the cause, or no output before an input, imposes additional restrictions on the frequency dependence of the optical properties so that the real and imaginary parts are not completely independent, but related to each other. In general, the relative permeability, which is complex and frequency dependent, can be expressed as

$$\mu = \mu' + i\mu'' = \mu_m/\mu_0 \tag{8.30}$$

The complex refractive index for magnetic materials should be defined as follows:

$$n = \sqrt{\varepsilon\mu} \tag{8.31}$$

The amplitude of the complex wavevector is $k = n\omega/c_0$, the same as Eq. (8.24). One can verify that Eq. (8.9) is a solution of the wave equation. The relative permittivity $\varepsilon$ and permeability $\mu$ will be used to formulate the general equations later in this chapter. In most sections of this chapter, we deal with nonmagnetic materials, such as metals, dielectrics, and semiconductors. However, we will devote Sec. 8.4.6 to the optical properties of magnetic materials, because of the emerging interest in metamaterials, which are synthesized materials with magnetic response at the microwave and higher frequencies (see Problem 8.6, for example).
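As a hedged illustration of Eqs. (8.22) and (8.23), the sketch below converts between the dielectric function and the optical constants for a nonmagnetic material. The function names and the numerical value used for $\varepsilon_0$ are assumptions introduced for this example; the principal square root is taken so that $n \ge 0$, and $\kappa \ge 0$ whenever $\varepsilon'' \ge 0$.

```python
import cmath

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m (assumed CODATA value)

def dielectric_function(eps_m_rel, sigma, omega):
    """Complex dielectric function of Eq. (8.22).

    eps_m_rel = eps_m / eps_0 (real), sigma in A/(V m), omega in rad/s.
    """
    return complex(eps_m_rel, sigma / (omega * EPS0))

def optical_constants(eps):
    """Return (n, kappa) with eps = (n + i*kappa)**2, Eq. (8.23)."""
    n_complex = cmath.sqrt(eps)  # principal root: real part >= 0
    return n_complex.real, n_complex.imag

# Round trip with illustrative values n = 3.5, kappa = 0.1
eps = complex(3.5**2 - 0.1**2, 2 * 3.5 * 0.1)
print(optical_constants(eps))
```

Passing a purely real, negative dielectric function, such as `complex(-4.0, 0.0)`, returns $n = 0$ and $\kappa = 2$.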

8.1.6 Propagating and Evanescent Waves

In an absorbing nonmagnetic medium, the electric and magnetic fields will attenuate exponentially. As an example, consider a wave that propagates in the positive x direction, with its electric field polarized in the y direction. Then,

$$\mathbf{E} = \hat{\mathbf{y}}E_0 e^{-i(\omega t - k'x)}e^{-k''x} \tag{8.32}$$

where $k' = \omega n/c_0$ and $k'' = \omega\kappa/c_0$ are the real and imaginary parts of the wavevector, respectively; that is, $\mathbf{k} = (k' + ik'')\hat{\mathbf{x}}$. Equation (8.32) suggests that the amplitude of the electric field will decay exponentially according to $e^{-(2\pi\kappa/\lambda)x}$. The magnetic field can be obtained from Eq. (8.25) as

$$\mathbf{H} = \hat{\mathbf{z}}\frac{n + i\kappa}{\mu_0 c_0}E_0 e^{-i(\omega t - k'x)}e^{-k''x} \tag{8.33}$$

By substituting Eq. (8.32) and Eq. (8.33) into Eq. (8.18b), we obtain the time-averaged energy flux in the x direction as

$$\langle S\rangle = \frac{n}{2\mu_0 c_0}E_0^2 e^{-2k''x} = \frac{n}{2\mu_0 c_0}E_0^2 e^{-a_\lambda x} \tag{8.34}$$

where $a_\lambda = 4\pi\kappa/\lambda$ is called the absorption coefficient. The inverse of $a_\lambda$ is called the radiation penetration depth (or photon mean free path) given by

$$\delta_\lambda = \frac{1}{a_\lambda} = \frac{\lambda}{4\pi\kappa} \tag{8.35}$$

It is the distance through which the radiation power is attenuated by a factor of 1/e (1lmp eb,l(l,T)dl. The numerical result is approximately 1:3 and independent of lmp

`

temperature. For a medium of refractive index $n$, the speed of light $c$ should be replaced by $c_0/n$ in Eq. (8.43). In the previous discussion, we have assumed a nondispersive medium with $n \equiv 1$, which is true for vacuum only. Corrections are rarely needed if the medium is a gas, but would be necessary for radiation inside solids or liquids. Furthermore, in a dispersive medium, the group velocity needs to be considered in deriving the density of states $D(\nu)$ in Eq. (8.40) and the energy flux in Eq. (8.42); see Prasher (Appl. Phys. Lett., 86, 071914, 2005).

Example 8-3. Assuming the sun to be a blackbody at 5800 K, calculate the emitted power at the following wavelength intervals: $\lambda < 0.3$ µm, 0.3 µm $< \lambda <$ 0.4 µm, 0.4 µm $< \lambda <$ 0.7 µm, 0.7 µm $< \lambda <$ 3 µm, and $\lambda > 3$ µm. Neglect the absorption by the atmosphere. What is the radiant power arriving at the earth’s surface from the sun?

Solution. The total emissive power is $\sigma_{SB}T_{sun}^4 = 5.67\times10^{-8}\times5800^4 \approx 64$ MW/m². We can obtain the emitted power in each spectral region by integrating Eq. (8.44), as listed in the following table. Note that $F_{\lambda_1\to\lambda_2}$ represents the fraction of radiation falling between $\lambda_1$ and $\lambda_2$.

λ (µm)      λ2·T (µm·K)   F(0→λ2)   F(λ1→λ2)   E_b (MW/m²)
<0.3        1740          0.03      0.03       1.9
0.3–0.4     2320          0.12      0.09       5.8
0.4–0.7     4060          0.49      0.37       23.7
0.7–3       17400         0.98      0.49       31.4
>3          ∞             1         0.02       1.3
Total       –             –         1          64.1

The total power emitted by the sun equals the emissive power multiplied by the surface area of the sun. The fraction of the power that reaches the earth equals the solid angle of the earth divided by $4\pi$. Note that the radius of the sun $r_{sun} = 6.955\times10^8$ m, the radius of the earth $r_{earth} = 6.378\times10^6$ m, and the earth-sun distance $R_{earth\text{-}sun} = 1.496\times10^{11}$ m. Therefore, the total power that will reach the earth’s surface, if the absorption by the atmosphere is neglected, is

$$\dot{Q} = 4\pi r_{sun}^2\cdot\sigma_{SB}T_{sun}^4\cdot\frac{\pi r_{earth}^2}{4\pi R_{earth\text{-}sun}^2} \approx 1.8\times10^{17}\ \text{W}$$

The average irradiation on the earth is $G = \dot{Q}/(\pi r_{earth}^2) \approx 1377$ W/m². This value is very close to the total solar irradiance (TSI), measured outside the earth’s atmosphere.
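The band fractions and the irradiance in Example 8-3 can be reproduced approximately with a short script. This is a sketch under stated assumptions: the text integrates Eq. (8.44) directly, whereas the code below uses a standard infinite-series form of the blackbody fraction with a rounded second radiation constant $C_2 \approx 14{,}388$ µm·K; the function names are illustrative. With these rounded constants the computed irradiance is close to, though not necessarily identical to, the value quoted above.

```python
import math

C2_UM_K = 14388.0   # second radiation constant in um K (rounded; assumption)
SIGMA_SB = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4), as in the example

def blackbody_fraction(lambda_T, n_terms=50):
    """Fraction of blackbody emissive power between 0 and lambda.

    lambda_T is the product of wavelength and temperature in um K.
    Uses the series 15/pi^4 * sum_n e^{-n z}/n * (z^3 + 3z^2/n + 6z/n^2 + 6/n^3),
    with z = C2 / (lambda T).
    """
    if math.isinf(lambda_T):
        return 1.0
    z = C2_UM_K / lambda_T
    total = 0.0
    for n in range(1, n_terms + 1):
        total += math.exp(-n * z) / n * (z**3 + 3 * z**2 / n + 6 * z / n**2 + 6 / n**3)
    return 15.0 / math.pi**4 * total

def band_powers(T=5800.0, edges_um=(0.3, 0.4, 0.7, 3.0)):
    """Emissive power (W/m^2) in each band delimited by edges_um."""
    e_total = SIGMA_SB * T**4
    cumulative = [0.0] + [blackbody_fraction(lam * T) for lam in edges_um] + [1.0]
    return [(cumulative[i + 1] - cumulative[i]) * e_total
            for i in range(len(cumulative) - 1)]

def solar_irradiance(T=5800.0, r_sun=6.955e8, distance=1.496e11):
    """Irradiation at the earth-sun distance, atmosphere neglected (W/m^2)."""
    return SIGMA_SB * T**4 * (r_sun / distance) ** 2

for power in band_powers():
    print(f"{power / 1e6:.1f} MW/m^2")
print(f"G = {solar_irradiance():.0f} W/m^2")
```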

Because of the broad spectral region of electromagnetic waves, alternative units are often used, such as wavelength $\lambda$ (in vacuum), wavenumber $\bar{\nu} = 1/\lambda$, frequency $\nu = c_0/\lambda$, angular frequency $\omega = 2\pi\nu$, and photon energy $E = h\nu$. Generally speaking, optical radiation covers the spectral region of ultraviolet (UV), visible (VIS), near-infrared (NIR),

TABLE 8.1 Spectral Regions Expressed in Different Units

                                     UV                  VIS        NIR      MIR     FIR         MW
                                     from–to             up to      up to    up to   up to       up to
Wavelength, λ (µm)*                  0.01–0.38           0.76       2.5      25      10^3        10^5
Wavenumber, ν̄ (cm^-1)                10^6–(2.6×10^4)     1.3×10^4   4×10^3   400     10          0.1
Frequency, ν (THz)                   (3×10^4)–790        395        120      12      0.3         3×10^-3
Angular frequency, ω (10^12 rad/s)   (2×10^5)–(5×10^3)   2.5×10^3   750      75      1.9         0.02
Photon energy, E (eV)†               124–3.3             1.63       0.5      0.05    1.2×10^-3   1.2×10^-5

* The wavelength will be reduced in a medium whose refractive index n is not unity.
† The conversion from the vacuum wavelength λ in µm to the photon energy E in eV is E = 1.240/λ.

mid-infrared (MIR), and far-infrared (FIR). Table 8.1 outlines the subdivisions of the spectral region in different units from ultraviolet (UV) to microwave (MW).
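The conversions behind Table 8.1 are simple enough to script. The helper below is an illustrative sketch; the function name `spectral_units`, the dictionary keys, and the SI constant values are assumptions introduced here. It takes a vacuum wavelength in µm and returns the equivalent wavenumber, frequency, angular frequency, and photon energy.

```python
import math

C0 = 299_792_458.0         # speed of light in vacuum, m/s (exact)
H_PLANCK = 6.62607015e-34  # Planck constant, J s
EV = 1.602176634e-19       # joules per electron volt

def spectral_units(wavelength_um):
    """Convert a vacuum wavelength in um to other spectral units."""
    lam_m = wavelength_um * 1e-6
    freq_hz = C0 / lam_m
    return {
        "wavenumber_cm-1": 1.0e4 / wavelength_um,
        "frequency_THz": freq_hz / 1e12,
        "angular_frequency_rad_s": 2.0 * math.pi * freq_hz,
        "photon_energy_eV": H_PLANCK * freq_hz / EV,
    }

# Long-wavelength edge of the visible region in Table 8.1
for name, value in spectral_units(0.76).items():
    print(f"{name}: {value:.4g}")
```

The photon energy agrees with the shortcut $E = 1.240/\lambda$ given in the table footnote to about three significant figures.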

8.2.2 Radiation Thermometry

The developments of the absolute temperature scale and radiation thermometry are among the most important applications of blackbody radiation. The Stefan-Boltzmann law $e_b = \sigma_{SB}T^4$ defines an absolute thermodynamic temperature, which is consistent with the one defined by the ideal-gas law and the Carnot cycle. While radiation thermometry can serve as a primary standard, most practical radiation thermometers are not absolute instruments because of other considerations such as fast response, easy operation, and low cost. High-temperature furnaces are commonly used as calibration standards. The cavity is a hollow cylinder, made of graphite for example, with a conical ending and a small aperture. The most accurate calibration source is the fixed-point heat pipe blackbody, for which a pure metal is melted outside the graphite cylinder to maintain a constant temperature in a two-phase state. The freezing temperatures are then used to define the temperature scales (1234.93 K for Ag, 1337.33 K for Au, and 1357.77 K for Cu). To measure the absolute temperature of a thermally radiative body, two blackbody cavities at different temperatures would be needed: one serves as the emitter (blackbody source) and the other as the receiver (radiometer). Quinn and Martin used a blackbody source and a cryogenic radiometer to directly determine the thermodynamic temperatures and measure the Stefan-Boltzmann constant.¹⁰ The experimentally obtained Stefan-Boltzmann constant was $(5.66967 \pm 0.00076)\times10^{-8}$ W/(m²·K⁴). The difference is about 0.013% of the theoretical value $(5.67040 \pm 0.00004)\times10^{-8}$ W/(m²·K⁴), based on Planck’s constant, Boltzmann’s constant, and the speed of light. Since the early 1990s, the National Institute of Standards and Technology (NIST) has developed a high-accuracy cryogenic radiometer (HACR) facility to serve as the primary standard for optical radiation measurements. A schematic of the original HACR receiver is shown in Fig. 8.6.
The receiver is mounted at the bottom of a liquid-helium cryostat in an evacuated chamber, and the optical access is through a Brewster window below the cavity. The HACR facility has gone through some major upgrades in recent years. The receiver cavity is made of copper with a high thermal conductivity and low specific heat at cryogenic temperatures. The inner wall of the cavity is coated with a specular black paint to absorb the incident radiation with an effective absorptance greater than 99.998%. The electrical-substitution technique links the radiant power to the electric power to achieve an overall uncertainty within 0.02% for optical power measurements. Detailed descriptions can be found from Pearson and Zhang and references therein.11 The cosmic microwave background radiation, measured with cryogenic bolometers, can be

FIGURE 8.6 Schematic of the receiver cavity of an absolute cryogenic radiometer, where GRT stands for germanium resistance thermometer, from Pearson and Zhang.11

fitted to the blackbody distribution at 2.7 K, which is the temperature of the universe at the present time. The discovery of cosmic radiation background in 1964 and the subsequent measurements and theoretical studies have been recognized by the Nobel Prizes in Physics to Arno Penzias and Robert Wilson in 1978 and to John Mather and George Smoot in 2006. Most radiation thermometers are based on spectral measurements rather than on the measurement of the total irradiance from the target. When a radiation thermometer is used to measure the temperature of a real surface, the unknown emissivity of the surface and the influence of the surrounding radiation are the major issues that affect the measurement. Various methods have been developed to deal with these problems, including the creation of a blackbody cavity on the surface, the two-color method, and the use of a

controlled reference source.¹² The development of optical fibers has allowed radiometric temperature measurements for surface locations that are otherwise inaccessible by imaging radiometers. The measurement equation of a spectral radiation thermometer can be approximated as follows:

$$V_d = C_I I_{ex,\lambda}(\lambda) \tag{8.46}$$

where $V_d$ is the detector output signal and $C_I$ is an instrument constant that is independent of the target material and temperature. The term $I_{ex,\lambda}(\lambda)$ is called the exitent spectral radiance, which includes the radiation emitted by the target and the surroundings, as well as that reflected by the target. The radiance temperature $T_\lambda$ (also called the brightness temperature) is defined according to

$$I_{b,\lambda}(\lambda, T_\lambda) = I_{ex,\lambda}(\lambda) \tag{8.47}$$

where $I_{b,\lambda}(\lambda, T_\lambda)$ is the blackbody intensity at the wavelength $\lambda$ and temperature $T_\lambda$. If the surrounding emission and absorption can be neglected, the exitent spectral radiance is due only to the emission; therefore,

$$I_{ex,\lambda}(\lambda) = I_{e,\lambda}(\lambda, T) = \varepsilon_\lambda' I_{b,\lambda}(\lambda, T) \tag{8.48}$$

where $\varepsilon_\lambda'$ is the directional-spectral emissivity, and $I_{e,\lambda}(\lambda, T)$ is the intensity emitted by the target. By combining Eq. (8.47) and Eq. (8.48) and applying Wien’s formula, the surface temperature is related to the radiance temperature by

$$\frac{1}{T} = \frac{1}{T_\lambda} + \frac{\lambda}{C_2}\ln\varepsilon_\lambda' \tag{8.49}$$

The uncertainty in the measured temperature due to an uncertainty in the emissivity is

$$\frac{\delta T}{T} = -\frac{\lambda T}{C_2}\frac{\delta\varepsilon_\lambda'}{\varepsilon_\lambda'} \tag{8.50}$$

The effect of the emissivity uncertainty on the temperature accuracy decreases as $\lambda$ decreases. However, the wavelength at which $I_{b,\lambda}(\lambda, T)$ is a maximum is given by Wien’s displacement law. In practice, the choice of the operating wavelength should also be based on the material’s properties and the surrounding radiation, and requires a detailed analysis of different effects. If the surrounding radiation is not negligible, $I_{ex,\lambda}(\lambda)$ is the sum of the emitted and reflected spectral radiances, and may be affected by participating medium emission and absorption.

Example 8-4. Rapid thermal processing is a semiconductor single-wafer manufacturing technique. A lightpipe radiation thermometer, at $\lambda = 0.95$ µm, is used to measure the wafer temperature. The emissivity of a plain silicon wafer is approximately 0.7 at this wavelength. Neglect the reflected radiation from the wafer. If the wafer is at a temperature of 1200 K, what is the radiance temperature? If the temperature needs to be determined within an uncertainty of 1 K, how much tolerance on the emissivity error is acceptable?

Solution. From Eq. (8.49), $T_\lambda \approx 1167$ K, which differs from the actual temperature by approximately 33 K. One can also solve Eq. (8.47) and Eq. (8.48), using Planck’s law, and the result is essentially the same. Based on Eq. (8.50), to obtain a temperature within an uncertainty of 1 K, the emissivity must be determined within an uncertainty of $|\delta\varepsilon_\lambda'| \approx 0.0074$. Zhou et al. (Int. J. Heat Mass Transfer, 45, 1945, 2002) developed a model to predict the effective emissivity of silicon wafers in rapid thermal processing furnaces and showed that the temperature measurement uncertainty can be significantly reduced by using a reflective cavity.
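The numbers in Example 8-4 can be reproduced with the short sketch below. It assumes a rounded second radiation constant $C_2 \approx 14{,}388$ µm·K, which is not stated in this section; the function names are illustrative. The Wien-based result of Eq. (8.49) is compared with the Planck-based solution of Eq. (8.47) and Eq. (8.48).

```python
import math

C2_UM_K = 14388.0  # second radiation constant, um K (rounded; assumption)

def radiance_temperature_wien(T, wavelength_um, emissivity):
    """Radiance temperature from Eq. (8.49), based on Wien's formula."""
    inv_T_lambda = 1.0 / T - (wavelength_um / C2_UM_K) * math.log(emissivity)
    return 1.0 / inv_T_lambda

def radiance_temperature_planck(T, wavelength_um, emissivity):
    """Radiance temperature from Eqs. (8.47) and (8.48) with Planck's law."""
    x = C2_UM_K / (wavelength_um * T)
    # emissivity / (e^x - 1) = 1 / (e^{x_lambda} - 1), solved for x_lambda
    x_lambda = math.log1p(math.expm1(x) / emissivity)
    return C2_UM_K / (wavelength_um * x_lambda)

def emissivity_tolerance(T, wavelength_um, emissivity, dT):
    """Magnitude of the emissivity error allowed by Eq. (8.50) for a given dT."""
    return emissivity * C2_UM_K / (wavelength_um * T) * dT / T

print(radiance_temperature_wien(1200.0, 0.95, 0.7))
print(radiance_temperature_planck(1200.0, 0.95, 0.7))
print(emissivity_tolerance(1200.0, 0.95, 0.7, 1.0))
```

Because $C_2/(\lambda T)$ is about 12.6 here, Wien’s formula is an excellent approximation and the two radiance temperatures differ by far less than 1 K.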

8.2.3 Entropy and Radiation Pressure

Like other particles, the photon gas also has the property of entropy and can be related to other properties in equilibrium states. Express the energy density in an enclosure of volume $V$, at thermodynamic equilibrium, with a temperature $T$ as $u = U/V = 4\sigma_{SB}T^4/c$. It can be seen that the specific heat at constant volume is $c_v = (\partial u/\partial T)_V = 16\sigma_{SB}T^3/c$. The entropy can therefore be obtained as

$$S = \int_0^T Vc_v\frac{dT}{T} = \frac{16}{3c}V\sigma_{SB}T^3 \tag{8.51a}$$

or

$$s = \frac{16}{3c}\sigma_{SB}T^3 \tag{8.51b}$$

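Note that $T = (\partial U/\partial S)_V$ is satisfied. The Helmholtz free energy is $A = U - TS = -\frac{4}{3}V\sigma_{SB}T^4/c$. Thus, the radiation pressure is

$$P = -\left(\frac{\partial A}{\partial V}\right)_T = \frac{4}{3c}\sigma_{SB}T^4 \tag{8.52}$$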
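For readers who want the intermediate steps, the relations quoted above can be checked as follows. This worked derivation is an added sketch that uses only the expressions for $U$, $S$, and $A$ already given in this section; the closing identity $P = u/3$ follows directly from $u = 4\sigma_{SB}T^4/c$.

```latex
\begin{align*}
U &= \frac{4V\sigma_{SB}T^4}{c}, \qquad S = \frac{16V\sigma_{SB}T^3}{3c},\\[4pt]
\left(\frac{\partial U}{\partial S}\right)_V
  &= \frac{(\partial U/\partial T)_V}{(\partial S/\partial T)_V}
   = \frac{16V\sigma_{SB}T^3/c}{16V\sigma_{SB}T^2/c} = T,\\[4pt]
A &= U - TS = \frac{4V\sigma_{SB}T^4}{c} - \frac{16V\sigma_{SB}T^4}{3c}
   = -\frac{4V\sigma_{SB}T^4}{3c},\\[4pt]
P &= -\left(\frac{\partial A}{\partial V}\right)_T
   = \frac{4\sigma_{SB}T^4}{3c} = \frac{u}{3}.
\end{align*}
```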

The force by the radiation pressure, albeit small, has some important applications in trapping and manipulating atomic and molecular particles. This technique is called optical traps or optical tweezers; see Lang and Block (Am. J. Phys., 71, 201, 2003) for a bibliographical review. If each photon mode (frequency) is individually considered, the spectral entropy density for unpolarized radiation can be expressed as follows:

$$s_\nu(\nu, T) = \frac{8\pi k_B\nu^2}{c^3}\left[\frac{x}{e^x - 1} + \ln\left(\frac{e^x}{e^x - 1}\right)\right] \tag{8.53}$$

where $x = h\nu/k_BT$. Note that $1/T = (\partial s_\nu/\partial u_\nu)_\nu = (k_B/h\nu)\ln(1 + 8\pi h\nu^3/u_\nu c^3)$, which is consistent with Eq. (8.41). Similar to the energy flux (emissive power) and intensity, the radiation entropy flux can be obtained by multiplying a factor $c/4$ to Eq. (8.51b) and Eq. (8.53), and the radiation entropy intensity can be obtained by dividing the flux by $\pi$, because of the isotropic nature of blackbody radiation. Clearly, electromagnetic radiation carries both energy and entropy.

Example 8-5. Consider the radiation heat transfer between two parallel plates at $T_1$ and $T_2$, respectively. Assume each plate has an area of $A$ and both plates are blackbodies. The separation distance is much smaller than $\sqrt{A}$ but much greater than the wavelength of thermal radiation. (a) How much entropy is generated at each plate? Evaluate the ratio of entropy generation assuming that $T_1 = 2T_2$. (b) If a thermophotovoltaic receiver is mounted on the lower-temperature side to convert thermal radiative energy to electricity (work), what is its maximum achievable efficiency?

Solution. (a) The net energy flow from plate 1 to 2 is $\dot{Q}_{12} = A\sigma_{SB}(T_1^4 - T_2^4)$. The entropy of plate 1 will decrease at the rate of $dS_1/dt = -\dot{Q}_{12}/T_1$, and the entropy of plate 2 will increase at the rate of $dS_2/dt = \dot{Q}_{12}/T_2$. On the other hand, the net entropy flow from plate 1 to 2 can be calculated as $\dot{S}_{12} = \frac{4}{3}A\sigma_{SB}(T_1^3 - T_2^3)$. Therefore, $\dot{S}_{gen,1} = -\dot{Q}_{12}/T_1 + \dot{S}_{12} = A\sigma_{SB}\left(\frac{1}{3}T_1^3 - \frac{4}{3}T_2^3 + T_2^4/T_1\right)$, $\dot{S}_{gen,2} = A\sigma_{SB}\left(\frac{1}{3}T_2^3 - \frac{4}{3}T_1^3 + T_1^4/T_2\right)$, and the combined total entropy generation is equal to $\dot{Q}_{12}(1/T_2 - 1/T_1)$, as expected. It can be shown that the entropy generation at each plate is always greater than zero if $T_1 \neq T_2$, or equal to zero if $T_1 = T_2$. When $T_1 = 2T_2$, the entropy generation by plate 1 is about one-quarter and that by plate 2 is about three-quarters of the total entropy generated.

(b) The available energy or exergy of thermal radiation is defined as the maximum work that can be produced by a system with respect to a large reservoir. In the present example, we may assume that the reservoir is at the same temperature as $T_2$. Suppose an amount of heat is taken from the high-temperature plate; we would like to find out the maximum work that can possibly be produced. Let us consider a reversible heat engine at $T_2$. The radiative energy leaving surface 1 can still be described by $\dot{Q}_1 = A\sigma_{SB}(T_1^4 - T_2^4)$, and the entropy leaving surface 1 is $\dot{S}_1 = \frac{4}{3}A\sigma_{SB}(T_1^3 - T_2^3)$. Therefore, the entropy generation in plate 1 cannot be eliminated. In other words, it is impossible to achieve the Carnot efficiency of $\eta_{Carnot} = 1 - T_2/T_1$. The maximum work can be obtained when the irreversibility at the lower-temperature plate is negligible and the heat engine is also reversible. It can easily be shown that the maximum work is $\dot{W}_{max} = \dot{Q}_1 - T_2\dot{S}_1$, and the optimal efficiency is given by

$$\eta_{opt} = \frac{\dot{W}_{max}}{\dot{Q}_1} = 1 - \frac{4(1 + y + y^2)}{3(1 + y)(1 + y^2)} \tag{8.54}$$

where $y = T_1/T_2 > 0$. When $y = 2$, we obtain an optimal efficiency $\eta_{opt} = 37.8\%$, which is less than the Carnot efficiency of 50%, because of the unrecoverable irreversibility at plate 1. A comprehensive discussion can be found from the review of Landsberg and Tonge.¹³
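The entropy bookkeeping of Example 8-5 is easy to verify numerically. The sketch below is illustrative only; the function names, the unit plate area, and the sample temperatures are assumptions introduced here. It evaluates the entropy generation at each plate and the optimal efficiency of Eq. (8.54).

```python
SIGMA_SB = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def entropy_generation(T1, T2, area=1.0):
    """Entropy generation rates (W/K) at plates 1 and 2 for black plates."""
    q12 = area * SIGMA_SB * (T1**4 - T2**4)                # net energy flow
    s12 = (4.0 / 3.0) * area * SIGMA_SB * (T1**3 - T2**3)  # net entropy flow
    s_gen_1 = -q12 / T1 + s12
    s_gen_2 = q12 / T2 - s12
    return s_gen_1, s_gen_2

def optimal_efficiency(T1, T2):
    """Optimal conversion efficiency from Eq. (8.54), with y = T1/T2."""
    y = T1 / T2
    return 1.0 - 4.0 * (1.0 + y + y * y) / (3.0 * (1.0 + y) * (1.0 + y * y))

s1, s2 = entropy_generation(600.0, 300.0)  # sample temperatures with T1 = 2*T2
print(f"plate 1 share: {s1 / (s1 + s2):.3f}")
print(f"plate 2 share: {s2 / (s1 + s2):.3f}")
print(f"optimal efficiency: {optimal_efficiency(600.0, 300.0):.3f}")
print(f"Carnot efficiency:  {1.0 - 300.0 / 600.0:.3f}")
```

For $T_1 = 2T_2$ the shares depend only on the temperature ratio, so any pair of temperatures with that ratio gives the same split.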

The next question is whether temperature can be defined for laser radiation. The answer is yes, and the temperature for high-intensity lasers can be very high. An intuitive guess is to define the temperature, based on the intensity $I_\nu$ of the laser or the monochromatic radiation, by setting $I_\nu = I_{b,\nu}(\nu, T_\nu)$. The definitions of entropy and thermodynamic temperature for optical radiation are very important for analyzing optical energy conversion systems, such as solar cells, thermophotovoltaic generators, luminescence devices, and laser cooling apparatus.¹³,¹⁴ Assume that the monochromatic radiation is from a thermodynamic equilibrium state, such as a resonance cavity that allows only a single mode to exist. The spectral entropy intensity of unpolarized radiation can be written as follows:¹³

$$L_\nu = \frac{2k_B\nu^2}{c^2}\left[\left(1 + \frac{c^2I_\nu}{2h\nu^3}\right)\ln\left(1 + \frac{c^2I_\nu}{2h\nu^3}\right) - \frac{c^2I_\nu}{2h\nu^3}\ln\left(\frac{c^2I_\nu}{2h\nu^3}\right)\right] \tag{8.55}$$

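Thermodynamically, the monochromatic radiation temperature can be defined by

$$\frac{1}{T_\nu(\nu)} = \left(\frac{\partial L_\nu}{\partial I_\nu}\right)_\nu = \frac{k_B}{h\nu}\ln\left(1 + \frac{2h\nu^3}{c^2I_\nu}\right) \tag{8.56}$$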

This is indeed Planck’s distribution of intensity at the same temperature. The expressions can be modified for polarized radiation. When the energy intensity is very high, Eq. (8.56) approaches $T_\nu(\nu) \approx c^2I_\nu/(2k_B\nu^2)$, which is in the Rayleigh-Jeans limit. The radiation temperature will be proportional to the intensity of the monochromatic radiation and can exceed $10^{10}$ K, with a 1-mW He-Ne laser at 632.8-nm wavelength.¹⁵ Therefore, for lasers with a moderate intensity, $T_\nu$ tends to be so high that the entropy is nearly zero; hence, the interaction of a laser beam with a material can be considered as work interaction. If a collimated beam is randomly scattered by a rough surface, the scattered radiation will have a much lower intensity because of the increase in the solid angle. The process is accompanied by an entropy increase and is thus irreversible. It is not possible to increase the intensity of the scattered light back to its original value without leaving any net effect on the environment of the photon system. On the other hand, if a nearly collimated light is split into two beams with a beamsplitter, the transmitted and reflected beams can interfere with each other to reconstruct the original beam. This process is reversible because the two beams are correlated. The correlated beams have lower entropy than those with the same intensity at thermodynamic equilibrium. The concept of temperature is applicable only if the maximum-entropy state has been reached.¹⁵ While the definition of the monochromatic radiation temperature is similar to that of the radiance temperature, the two concepts are quite different. In the definition of radiance temperature, the quality (entropy) does not


enter into play. On the other hand, the definition of the monochromatic temperature for incoherent radiation applies to a state that is in equilibrium within certain wavelength and angular ranges. Consider a gray-diffuse body, for which the emissive power is proportional to the blackbody emissive power, at any frequency and angle of emission. The monochromatic temperature calculated from Eq. (8.56), however, is frequency dependent. This is because the emitted radiation, as a whole, cannot be considered as a blackbody at any temperature. Thermal radiation of this type has been called dilute blackbody radiation.13 This simple example shows that photons at any given frequency can be considered as in a thermodynamic equilibrium but not necessarily in equilibrium with photons at other frequencies. When the two linear polarizations of the radiation have different intensities, the monochromatic temperature differs between the two polarizations as well. In general, it is a function of frequency, direction, and polarization. The requirement is that each subsystem be in a thermodynamic equilibrium, even though it is not in equilibrium with other subsystems at the same spatial location. Photons at different frequencies, with different polarization states, or propagating toward different directions, can coexist in their own equilibrium state without any interaction with each other. The concept may be called partial equilibrium, as in the case when the two parts of a cylinder are separated by a movable adiabatic wall. Mechanical equilibrium is established to maintain the same pressure on each side, but the temperatures may differ because thermal equilibrium is reached only inside each portion but not between them. Another example is in ultrafast laser heating of metals, as discussed in Chap. 7, where the electron and phonon systems can be treated as in separate equilibrium states but not in equilibrium with each other.
The concept of entropy intensity has recently been applied by Caldas and Semiao to study the entropy generation in an absorbing, emitting, and scattering medium, based on the equation of radiative transfer (ERT) introduced in Sec. 2.4.3 (Ref. 16). The key is that the change in entropy in an elemental path length equals the change in intensity divided by the radiation temperature. The entropy change at steady state can be obtained from Eq. (2.53) in Chap. 2 as follows:

$$\frac{dL_\lambda(\xi,\Omega)}{d\xi} = \frac{a_\lambda I_{b,\lambda}}{T_\lambda(I_\lambda)} - \frac{(a_\lambda+\sigma_\lambda)I_\lambda}{T_\lambda(I_\lambda)} + \frac{\sigma_\lambda}{4\pi}\int_{4\pi}\frac{I_\lambda(\xi,\Omega')}{T_\lambda(I_\lambda)}\Phi_\lambda(\Omega',\Omega)\,d\Omega' \tag{8.57}$$

Like $I_\lambda$, the entropy intensity $L_\lambda$ is a function of wavelength, location, and direction. Note that $I_{b,\lambda} = I_{b,\lambda}(\lambda,T_g)$, where $T_g$ is the local temperature. For an anisotropic radiation field, $T_\lambda(I_\lambda)$ would be different for different directions. For nonblackbody radiation, $T_\lambda(I_\lambda)$ will be a function of wavelength. The term $I_\lambda/T_\lambda(I_\lambda)$, however, is not the same as $L_\lambda$. Integration of Eq. (8.57) over the solid angle of $4\pi$ at all wavelengths in a volume element yields the entropy that is transferred out of the control volume. Furthermore, the entropy change in the control volume is equal to the total energy absorbed divided by $T_g$. The energy rate received per unit volume can be expressed as

$$\dot{q} = \int_0^\infty\int_{4\pi} a_\lambda\,(I_\lambda - I_{b,\lambda})\,d\Omega\,d\lambda \tag{8.58}$$

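Because the entropy change is the sum of the net entropy transferred into the system and the entropy generation by irreversibility, we can express the volumetric entropy generation rate as

$$\dot{s}_{\mathrm{gen}} = \int_0^\infty\int_{4\pi} a_\lambda I_{b,\lambda}\left[\frac{1}{T_\lambda(I_\lambda)}-\frac{1}{T_g}\right]d\Omega\,d\lambda + \int_0^\infty\int_{4\pi}\left[\frac{a_\lambda}{T_g}-\frac{a_\lambda+\sigma_\lambda}{T_\lambda(I_\lambda)}\right]I_\lambda\,d\Omega\,d\lambda + \int_0^\infty\int_{4\pi}\left[\frac{\sigma_\lambda}{4\pi}\int_{4\pi}\frac{I_\lambda(\xi,\Omega')}{T_\lambda(I_\lambda)}\Phi_\lambda(\Omega',\Omega)\,d\Omega'\right]d\Omega\,d\lambda \tag{8.59}$$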


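For an isotropic field, $I_\lambda$ is independent of the direction, and scattering does not contribute to the entropy generation. In this case, the entropy generation becomes

$$\dot{s}_{\mathrm{gen}} = \int_0^\infty\int_{4\pi} a_\lambda\left(I_{b,\lambda}-I_\lambda\right)\left[\frac{1}{T_\lambda(I_\lambda)}-\frac{1}{T_g}\right]d\Omega\,d\lambda \tag{8.60}$$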

The entropy generation is always greater than zero, because the intensity is an increasing function of temperature, unless the medium is at thermal equilibrium. When a surface is involved in radiative heat transfer, the entropy generation rate per unit area can be expressed as

$$\dot{s}''_{\mathrm{gen}} = \int_0^\infty\int_0^{2\pi}\int_0^{\pi/2}\left[\frac{I_{\mathrm{in},\lambda}-I_{\mathrm{out},\lambda}}{T_w}-\left(L_{\mathrm{in},\lambda}-L_{\mathrm{out},\lambda}\right)\right]\cos\theta\,\sin\theta\,d\theta\,d\phi\,d\lambda \tag{8.61}$$

where $T_w$ is the wall temperature, and the subscripts “in” and “out” signify the energy or entropy intensity to and from the surface, respectively. If the surface is not a blackbody, the outgoing intensity includes both the emitted and reflected intensities. An alternative approach is to integrate the intensity over the whole sphere with a solid angle of $4\pi$. In Eq. (8.61), the entropy intensity is related to the energy intensity by Eq. (8.55), which is recast in terms of wavelength as follows:

$$L_\lambda(\lambda,I_\lambda) = \frac{2k_Bc}{\lambda^4}\left[\left(1+\frac{\lambda^5I_\lambda}{2hc^2}\right)\ln\left(1+\frac{\lambda^5I_\lambda}{2hc^2}\right)-\frac{\lambda^5I_\lambda}{2hc^2}\ln\left(\frac{\lambda^5I_\lambda}{2hc^2}\right)\right] \tag{8.62}$$
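The wavelength form of the entropy intensity can be checked numerically against the definition of the radiation temperature. The sketch below is illustrative only: the function names and the chosen wavelength and temperature are assumptions. It evaluates Eq. (8.62) for blackbody radiation and verifies by a central finite difference that $\partial L_\lambda/\partial I_\lambda = 1/T$.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
KB = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8      # speed of light in vacuum, m/s

def planck_intensity_lambda(lam, T):
    """Blackbody spectral intensity I_b,lambda (W m^-2 sr^-1 m^-1)."""
    return 2.0 * H * C**2 / lam**5 / math.expm1(H * C / (lam * KB * T))

def entropy_intensity(lam, I_lam):
    """Spectral entropy intensity L_lambda from Eq. (8.62), unpolarized radiation."""
    x = lam**5 * I_lam / (2.0 * H * C**2)
    return 2.0 * KB * C / lam**4 * ((1.0 + x) * math.log1p(x) - x * math.log(x))

lam = 5.0e-6          # wavelength, m (illustrative choice)
T = 1000.0            # blackbody temperature, K (illustrative choice)
I = planck_intensity_lambda(lam, T)

# Central finite difference for dL/dI at fixed wavelength.
dI = 1.0e-6 * I
dLdI = (entropy_intensity(lam, I + dI) - entropy_intensity(lam, I - dI)) / (2.0 * dI)

print(dLdI * T)       # close to 1 if dL/dI = 1/T
```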

The use of Eq. (8.62) may be disputed when multiple reflections occur. The intensity of the emitted radiation is less than that of the blackbody and is reduced by each reflection. The question remains whether the entropy should be calculated from the blackbody intensity, from the actual intensity after each reflection, or from the combined intensity at any given location. An example is a system of two large parallel plates, separated by vacuum. One of the plates is at a temperature $T_1$ and is diffuse-gray with an emissivity of 0.5. The other plate is insulated and is a perfect reflector (i.e., zero emissivity). It is clear that a thermal equilibrium will be established in the cavity after a long time. Again, the separation distance is much larger than the thermal radiation wavelengths. The radiation leaving surface 1 includes the emitted rays, as well as the first-order and higher-order reflected rays. An attempt to define the entropy of the emitted ray and each reflected ray will result in a total entropy intensity greater than the entropy intensity calculated based on the blackbody intensity $I_{b,\lambda}(\lambda,T_1)$. Therefore, to apply the previous analysis in a consistent way and to obtain meaningful results, we must make the following hypotheses:

• The intensity at any given location is additive regardless of where it originates from, as long as it falls within the same solid angle and wavelength intervals. While this sounds obvious, it is untrue when interference effects become important. The resulting intensity is called the combined intensity.
• The monochromatic radiation temperature $T_\lambda$, defined in Eq. (8.56), is a function of the combined intensity and is in general dependent on the direction and wavelength. The effect of polarization is neglected to simplify the problem. Equation (8.56) must not be applied to each of the reflected or scattered rays. The physical significance is that all the photons, with the same wavevector and frequency, can be considered as a subsystem that is at thermodynamic equilibrium with the temperature $T_\lambda[I_\lambda(\lambda,\theta,\phi)]$.
• The entropy intensity is defined based on the combined intensity, according to Eq. (8.62). While entropy must be additive, the entropy of all individual rays must be calculated based on the monochromatic temperature of the combined intensity. Because the number of photons, intensity, and entropy are additive, the fraction of the entropy of each ray is the same as the ratio of the intensity of that ray to the combined intensity.

With the theories presented in this section, one should be able to perform a second law thermodynamic analysis for a given system, involving radiative transfer of energy. Zhang and Basu investigated entropy flow and generation considering incoherent multiple reflections.17 There exist different approximations in analyzing the entropy of radiation. For example, the method of dilute blackbody radiation uses a dilution factor and defines an effective temperature for each wavelength.13 When the process is very complicated, it appears that such an effective temperature cannot be easily defined and this definition cannot be applied to multiple reflections.

Entropy generation is usually accompanied by the generation of heat, such as heating by friction, electrical resistor, chemical reaction, or absorption of solar radiation. On the other hand, it appears that entropy generation can occur in radiation without the generation of heat, such as by scattering. The definition of elastic scattering is based on the conservation of energy (wavelength) and momentum, which does not impose any constraints on the reversibility. Further research is much needed in order to better understand the nature of entropy of radiation and determine the ultimate efficiency of photovoltaic cells and other radiative processes, including laser cooling and trapping. Another area of possible application of radiation entropy is in nanoscale heat conduction using the EPRT, as discussed in Chap. 7. The entropy concept may be extended to the phonon system by defining radiation entropy and entropy intensity of phonons.

8.2.4 Limitations of Planck’s Law

The concept that a blackbody surface absorbs all radiant energy that is incident upon it is based purely on the geometric-optics point of view, in which light travels in a straight line and cannot interact with an object that does not intercept the light ray. Another example of the geometric-optics viewpoint is that the transmittance of an iris (open aperture) should be 1, i.e., all the radiation incident on the opening will go through. However, for an aperture whose diameter is comparable to the wavelength of the incident radiation, diffraction may become important and, as a result, the transmittance can be less or even greater than 1. Due to the diffraction effect, a particle that is sufficiently small compared to the wavelength will interact with the radiation field, according to the scattering and absorption cross sections, which can be greater than the projected surface area. In some cases, it is possible for the object to absorb more energy than the product of the radiant flux and the projection area. The absorptance can be greater than 1 and thus exceeds the limit set by a blackbody. When such an object is placed in an isothermal enclosure, the emitted energy will be greater than that from a blackbody having the same dimensions. This anomaly has been discussed in detail by Bohren and Huffman.7 The energy density near the surface within a distance less than the wavelength can be much greater than that given by Eq. (8.41) and increases as the distance is further reduced. When two objects are placed at a distance much smaller than the characteristic wavelength of thermal radiation, i.e., in the near field, photon tunneling can occur and cause significant enhancement of the energy transfer.
In recent years, there have been numerous studies of light transmission through small apertures, radiation heat transfer at nanometer distances, and light emission from nanostructures.18 This is still an open field with many new developments as well as controversies. We will study these phenomena and the underlying physics in the following two chapters. The entropy concept and the second law limitation have not been applied to the study of near-field energy transfer.


8.3 RADIATIVE PROPERTIES OF SEMI-INFINITE MEDIA

8.3.1 Reflection and Refraction of a Plane Wave

Consider radiation incident from one medium to another at the interface or the boundary. The boundary that separates the media is assumed to be a smooth plane and extends to infinity. Each medium is homogeneous and isotropic; so, there is no scattering within the medium. Therefore, the electric response can be characterized by the relative permittivity or dielectric function $\varepsilon$, and the magnetic response can be characterized by the relative permeability $\mu$. For nonmagnetic materials, the refractive index is related to the dielectric function by $n = \sqrt{\varepsilon}$. Keep in mind that these quantities are, in general, complex and frequency dependent. The real and imaginary parts of the refractive index are often called the optical constants. In this section, we present the general formulation for both magnetic and nonmagnetic materials. For certain crystalline and amorphous solids, like quartz and glass, the refractive index is real in a wide spectral region and is the only parameter needed to fully characterize the optical response of the material. In such a case, the expressions can be greatly simplified and the results can be easily comprehended. The reduced results will also be presented because of their importance to numerous engineering problems.

FIGURE 8.7 Illustration of reflection and transmission at an interface: (a) TE wave or s polarization. (b) TM wave or p polarization.

The incident radiation is a monochromatic plane wave with an angular frequency $\omega$. As shown in Fig. 8.7, the wavevector of the incident wave is $\mathbf{k}_1^+ = (k_{1x},0,k_{1z})$, and the surface normal defines the plane of incidence, which is the x-z plane. The wavevectors of the reflected and transmitted waves must lie in the same plane. The angle of incidence $\theta_1$ is the angle between the incident wavevector and the z direction, i.e., $\sin\theta_1 = k_{1x}/k_1$ and $\cos\theta_1 = k_{1z}/k_1$, where $k_1^2 = k_{1x}^2 + k_{1z}^2 = \mu_1\varepsilon_1\omega^2/c_0^2$. It is common to study the reflection and the refraction for linearly polarized waves, with either the electric or magnetic field being parallel to the y-axis, because other polarizations can be decomposed into the two polarization components. When the electric field is in the y direction, as shown in Fig. 8.7a, the wave is called a transverse-electric (TE) wave or is said to be perpendicularly (s) polarized. The incident electric field can be expressed as follows by omitting the time-harmonic term $e^{-i\omega t}$ hereafter:

$$\mathbf{E}_i = \hat{\mathbf{y}}\,E_i\,e^{ik_{1z}z+ik_{1x}x} \tag{8.63}$$


The boundary conditions state that the tangential components of both $\mathbf{E}$ and $\mathbf{H}$ must be continuous at the interface. This implies that the x component of the wavevector must be the same for the incident, reflected, and transmitted waves, i.e., $k_{1x} = k_{2x} = k_x$. Because the angle of reflection must be the same as the angle of incidence (specular reflection), we have $\mathbf{k}_1^- = (k_x,0,-k_{1z})$. For the transmitted or refracted wave, we have $\mathbf{k}_2 = (k_x,0,k_{2z})$ and

$$\sin\theta_2 = \frac{k_x}{k_2} = \frac{n_1\sin\theta_1}{n_2} \tag{8.64}$$

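which is called Snell’s law. It can be easily visualized by observing the bent image of a chopstick in a bowl of water. Note that $k_{2z}^2 = k_2^2 - k_x^2 = \mu_2\varepsilon_2\omega^2/c_0^2 - k_x^2 = k_2^2\cos^2\theta_2$. Generally speaking, the wavevector components and the refractive indices may be complex. Complex angles can be defined so that Eq. (8.64) is always valid. Near the interface, the nonzero components of the electric and magnetic fields are

$$E_y = \begin{cases}\left(E_ie^{ik_{1z}z}+E_re^{-ik_{1z}z}\right)e^{ik_xx} & \text{for } z<0\\ E_te^{ik_{2z}z}e^{ik_xx} & \text{for } z>0\end{cases} \tag{8.65}$$

$$H_x = \begin{cases}-\dfrac{k_{1z}}{\omega\mu_0\mu_1}\left(E_ie^{ik_{1z}z}-E_re^{-ik_{1z}z}\right)e^{ik_xx} & \text{for } z<0\\[2ex] -\dfrac{k_{2z}}{\omega\mu_0\mu_2}E_te^{ik_{2z}z}e^{ik_xx} & \text{for } z>0\end{cases} \tag{8.66}$$

and

$$H_z = \begin{cases}\dfrac{k_x}{\omega\mu_0\mu_1}\left(E_ie^{ik_{1z}z}+E_re^{-ik_{1z}z}\right)e^{ik_xx} & \text{for } z<0\\[2ex] \dfrac{k_x}{\omega\mu_0\mu_2}E_te^{ik_{2z}z}e^{ik_xx} & \text{for } z>0\end{cases} \tag{8.67}$$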

where $E_i$, $E_r$, and $E_t$ are, respectively, the amplitudes of the incident, reflected, and transmitted electric fields at the interface. It is further assumed that $k_x$ is real so that the amplitude of the field is independent of x. The Fresnel reflection and transmission coefficients for a TE wave are defined as $r_{12,s} = E_r/E_i$ and $t_{12,s} = E_t/E_i$, respectively. Boundary conditions require that $E_y$ and $H_x$ be continuous at $z = 0$. From Eq. (8.65) and Eq. (8.66), we obtain $1 + r_{12,s} = t_{12,s}$ and $(k_{1z}/\mu_1)(1 - r_{12,s}) = (k_{2z}/\mu_2)\,t_{12,s}$; thus,

$$r_{12,s} = \frac{E_r}{E_i} = \frac{k_{1z}/\mu_1-k_{2z}/\mu_2}{k_{1z}/\mu_1+k_{2z}/\mu_2} \tag{8.68a}$$

and

$$t_{12,s} = \frac{E_t}{E_i} = \frac{2k_{1z}/\mu_1}{k_{1z}/\mu_1+k_{2z}/\mu_2} \tag{8.68b}$$

which are generally applicable, as long as each medium is homogeneous and isotropic.6 For nonmagnetic materials, the previous equations can be written as follows:

$$r_{12,s} = \frac{n_1\cos\theta_1-n_2\cos\theta_2}{n_1\cos\theta_1+n_2\cos\theta_2} \tag{8.69a}$$

and

$$t_{12,s} = \frac{2n_1\cos\theta_1}{n_1\cos\theta_1+n_2\cos\theta_2} \tag{8.69b}$$


The directional-hemispherical spectral reflectivity, or simply reflectivity, $\rho'_\lambda$ is given by the ratio of the reflected energy flux to the incident energy flux, and the directional-spectral absorptivity $\alpha'_\lambda$ is the ratio of the transmitted energy flux to the incident energy flux, since all the photons transmitted through the interface will be absorbed inside the second medium. We use terms ending with “-ivity” only for a perfect interface and those with “-tance” for surfaces with roughness and coatings. The energy flux is related to the time-averaged Poynting vector, defined in Eq. (8.18b). From Eq. (8.65) to Eq. (8.67), the x and z components of the Poynting vector at the interface ($z \to 0$) in medium 1 are

$$\langle S_{1x}\rangle = \frac{1}{2}\,\mathrm{Re}\left[\frac{k_x^*}{\omega\mu_0\mu_1^*}\left(E_i+E_r\right)\left(E_i^*+E_r^*\right)\right] \tag{8.70a}$$

and

$$\langle S_{1z}\rangle = \frac{1}{2}\,\mathrm{Re}\left[\frac{k_{1z}^*}{\omega\mu_0\mu_1^*}\left(E_i+E_r\right)\left(E_i^*-E_r^*\right)\right] \tag{8.70b}$$

It can be seen that, in general, the reflected wave and the incident wave are coupled and the energy flow cannot be separated into a reflected flux and an incident flux. Under the assumption that medium 1 is lossless (nonabsorbing or nondissipative) and $k_x^2 < k_1^2$, we can write

$$\langle S_{1z}\rangle = \langle S_{iz}\rangle - \langle S_{rz}\rangle \tag{8.71}$$

where

$$\langle S_{iz}\rangle = \frac{k_{1z}}{2\omega\mu_0\mu_1}\,|E_i|^2 \quad\text{and}\quad \langle S_{rz}\rangle = \frac{k_{1z}}{2\omega\mu_0\mu_1}\,|E_r|^2 \tag{8.72}$$

If medium 1 is lossy, there will be additional terms associated with $E_iE_r^*$ and $E_i^*E_r$. In this case, the power flow normal to the interface cannot be separated into forward and backward terms because of the cross-coupling terms. Therefore, the lossless condition in medium 1 is required in order to properly define the energy reflectivity; see Salzberg (Am. J. Phys., 16, 444, 1948) and Zhang (J. Heat Transfer, 119, 645, 1997). This is usually not a problem when radiation is incident from air or a dielectric prism onto a medium. The power reflectivity can be defined based on the z components of the reflected and incident Poynting vectors; therefore,

$$\rho'_{\lambda,s}(\theta_1) = \frac{|E_r|^2}{|E_i|^2} = |r_{12,s}|^2 \tag{8.73}$$

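The Poynting vector at the interface in medium 2 can be written as

$$\langle \mathbf{S}_t\rangle = \frac{1}{2\omega\mu_0}\,\mathrm{Re}\left(\frac{k_x^*\hat{\mathbf{x}}+k_{2z}^*\hat{\mathbf{z}}}{\mu_2^*}\right)|E_t|^2 \tag{8.74}$$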

which is not parallel to $\mathrm{Re}(\mathbf{k}_2)$ unless $\mathrm{Im}(\mu_2) = 0$. Recall that the plane of constant phase is perpendicular to $\mathrm{Re}(\mathbf{k}_2)$. If medium 2 is dissipative, $\mathrm{Im}(\mathbf{k}_2)$ is parallel to the z-axis and the amplitude will vary along the z direction. The wave becomes inhomogeneous in medium 2, except when $k_x = 0$ (normal incidence). The definition of the transmitted energy flux at the interface is based on the projected Poynting vector in the z direction. Hence, the absorptivity is the ratio of the z components of the transmitted and incident Poynting vectors, viz.,

$$\alpha'_{\lambda,s}(\theta_1) = \frac{\mathrm{Re}(k_{2z}/\mu_2)}{\mathrm{Re}(k_{1z}/\mu_1)}\,|t_{12,s}|^2 \tag{8.75}$$

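Note that $\mathrm{Re}(k_{2z}/\mu_2) = \mathrm{Re}(k_{2z}^*/\mu_2^*)$, and $\mathrm{Re}(k_{1z}/\mu_1) = k_{1z}/\mu_1$ since medium 1 is lossless. It can be shown that $\rho'_{\lambda,s} + \alpha'_{\lambda,s} = 1$, as required by energy conservation: $\langle S_{1z}\rangle = \langle S_{2z}\rangle$ at $z = 0$. For nonmagnetic and nearly nondissipative materials, we have

$$\alpha'_{\lambda,s}(\theta_1) = \frac{n_2\cos\theta_2}{n_1\cos\theta_1}\,|t_{12,s}|^2 \tag{8.76}$$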


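The reflection and transmission coefficients for the transverse-magnetic (TM) wave or parallel (p) polarization are defined as the ratios of the magnetic fields: $r_{12,p} = H_r/H_i$ and $t_{12,p} = H_t/H_i$, respectively. Hence,

$$r_{12,p} = \frac{H_r}{H_i} = \frac{k_{1z}/\varepsilon_1-k_{2z}/\varepsilon_2}{k_{1z}/\varepsilon_1+k_{2z}/\varepsilon_2} \tag{8.77a}$$

and

$$t_{12,p} = \frac{H_t}{H_i} = \frac{2k_{1z}/\varepsilon_1}{k_{1z}/\varepsilon_1+k_{2z}/\varepsilon_2} \tag{8.77b}$$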

In the case of nonmagnetic materials, we obtain

$$r_{12,p} = \frac{n_2\cos\theta_1-n_1\cos\theta_2}{n_2\cos\theta_1+n_1\cos\theta_2} \tag{8.78a}$$

and

$$t_{12,p} = \frac{2n_2\cos\theta_1}{n_2\cos\theta_1+n_1\cos\theta_2} \tag{8.78b}$$

At normal incidence, the reflection coefficients in Eq. (8.69a) and Eq. (8.78a) are related by

$$r_{12,s} = \frac{n_1-n_2}{n_1+n_2} = -r_{12,p} \tag{8.79}$$

When both $n_1$ and $n_2$ are real and $n_1 < n_2$, the electric field will experience a phase reversal (phase shift of $\pi$) upon reflection but the magnetic field will not. On the other hand, if $n_1 > n_2$, it is the magnetic field that will experience a phase reversal. In fact, based on Maxwell’s equations, the electric and magnetic quantities obey a duality, when $\rho_e = 0$, and can be interchanged with the following substitutions: $\mathbf{E} \to \mathbf{H}$ and $\mathbf{H} \to -\mathbf{E}$. Note that $\varepsilon$ and $\mu$, as well as the polarization states s and p, should also be interchanged. The Poynting vector for a TM wave is $\langle\mathbf{S}\rangle = \mathrm{Re}(\mathbf{k}/\varepsilon)\,|H_y|^2/(2\omega\varepsilon_0)$, which is not parallel to $\mathrm{Re}(\mathbf{k})$ when $\mathrm{Im}(\varepsilon_2) \neq 0$. Upon refraction into an absorbing medium, the waves become inhomogeneous and the Poynting vectors for different polarizations may split into different directions; see Halevi and Mendoza-Hernandez (J. Opt. Soc. Am., 71, 1238, 1981). Nevertheless, the constant-amplitude plane is always perpendicular to the z direction because the amplitude cannot change along the x-y plane. The reflectivity for p polarization is

$$\rho'_{\lambda,p}(\theta_1) = |r_{12,p}|^2 \tag{8.80}$$

Hence, the absorptivity becomes

$$\alpha'_{\lambda,p}(\theta_1) = \frac{\mathrm{Re}(k_{2z}/\varepsilon_2)}{\mathrm{Re}(k_{1z}/\varepsilon_1)}\,|t_{12,p}|^2 \tag{8.81}$$

For nonmagnetic and nearly nonabsorbing materials, we have

$$\alpha'_{\lambda,p}(\theta_1) = \frac{n_1\cos\theta_2}{n_2\cos\theta_1}\,|t_{12,p}|^2 \tag{8.82}$$

If the incident wave is unpolarized or circularly polarized, the reflectivity can be obtained by averaging the values for p- and s-polarized waves, i.e.,

$$\rho'_\lambda = \frac{\rho'_{\lambda,p}+\rho'_{\lambda,s}}{2} \tag{8.83}$$
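The Fresnel relations above can be sketched in a few lines of Python. This is an illustrative implementation for nonmagnetic media under the assumption that medium 1 is lossless; the function name `fresnel_interface` and the sample refractive indices are hypothetical choices, and the wavevector components are normalized by $\omega/c_0$. It follows the general forms of Eqs. (8.68), (8.75), (8.77), and (8.81) so that a complex $n_2$ can be used.

```python
import cmath
import math

def fresnel_interface(n1, n2, theta1):
    """Return (rho_s, rho_p, alpha_s, alpha_p) at a plane interface.

    n1 must be real (lossless medium 1); n2 may be complex (n + i*kappa).
    Both media are assumed nonmagnetic (mu1 = mu2 = 1).
    """
    kx = n1 * math.sin(theta1)             # conserved tangential component
    k1z = n1 * math.cos(theta1)
    k2z = cmath.sqrt(n2 ** 2 - kx ** 2)    # k2z^2 = k2^2 - kx^2
    if k2z.imag < 0.0:                     # keep the wave decaying into medium 2
        k2z = -k2z
    eps1, eps2 = n1 ** 2, n2 ** 2
    r_s = (k1z - k2z) / (k1z + k2z)
    t_s = 2.0 * k1z / (k1z + k2z)
    r_p = (k1z / eps1 - k2z / eps2) / (k1z / eps1 + k2z / eps2)
    t_p = (2.0 * k1z / eps1) / (k1z / eps1 + k2z / eps2)
    rho_s = abs(r_s) ** 2
    rho_p = abs(r_p) ** 2
    alpha_s = k2z.real / k1z * abs(t_s) ** 2
    alpha_p = (k2z / eps2).real / (k1z / eps1) * abs(t_p) ** 2
    return rho_s, rho_p, alpha_s, alpha_p

# Hypothetical absorbing medium, 40-degree incidence from air.
rho_s, rho_p, alpha_s, alpha_p = fresnel_interface(1.0, 2.0 + 1.0j, math.radians(40.0))
rho_unpolarized = 0.5 * (rho_s + rho_p)    # Eq. (8.83)
print(rho_s + alpha_s, rho_p + alpha_p, rho_unpolarized)
```

For each polarization the reflectivity and absorptivity should sum to 1, which offers a quick consistency check on the implementation.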


The reflectivity for radiation incident from air ($n_1 \approx 1$) to a dielectric medium ($n_2 = 2$) and that from the dielectric to air are shown in Fig. 8.8 for each polarization as well as for the unpolarized incident radiation.

FIGURE 8.8 Reflectivity versus the angle of incidence between air and a dielectric. [Plots of the reflectivity $\rho'_\lambda$ versus the angle of incidence $\theta_1$ (deg) for the TE wave, the TM wave, and their average: (a) incident from air to a medium, $n_1 = 1$ and $n_2 = 2$, with $\theta_B = 63.4^\circ$; (b) incident from a medium to air, $n_1 = 2$ and $n_2 = 1$, with $\theta_B = 26.6^\circ$ and total internal reflection beyond $\theta_c = 30^\circ$.]

When $n_1 > n_2$, the reflectivity will reach 1 at $\theta_1 = \theta_c = \sin^{-1}(n_2/n_1)$. This angle is called the critical angle, and total internal reflection occurs at angles of incidence greater than the critical angle. This is the principle commonly used in optical fibers and waveguides, since light is trapped inside the high-index material and propagates along the medium. It can be seen that in total internal reflection, $k_x > k_2$ while $k_{2z}$ becomes purely imaginary. The amplitude of the wave is exponentially attenuating in the positive z direction. This is similar to Eq. (8.37) and is an evanescent wave, as shown in Fig. 8.3. The time-averaged Poynting vector is zero in the z direction. Hence, no energy is transmitted across the boundary. For the TE wave, the reflectivity increases monotonically with the angle of incidence and reaches 1 at the grazing angle ($90^\circ$) or at the critical angle when $n_1 > n_2$. The reflectivity for the TM wave, on the other hand, goes through a minimum that is equal to zero. The angle at which $\rho'_{\lambda,p} = 0$ is called the Brewster angle, given by $\theta_B = \tan^{-1}(n_2/n_1)$ for nonmagnetic materials. For p polarization, all the incident energy will be transmitted into medium 2, without reflection at the Brewster angle. This phenomenon has been used to build polarizers and transmission windows in absolute cryogenic radiometers. The physical mechanism of reflection can also be understood as the re-emission by the induced electric dipoles in the medium, based on the Ewald-Oseen extinction theorem. At the Brewster angle, the electric dipoles induced in the material align in the direction of the reflected wave, and the refracted wave is perpendicular to the reflected wave (i.e., $\theta_1 + \theta_2 = 90^\circ$). The reflected power goes to zero because an electric dipole cannot radiate along its own axis. The situation is changed when magnetic materials are involved, such as a negative index material.
The fields radiated by both the induced electric dipoles and magnetic dipoles are responsible for the reflection. The Brewster angle can occur for either polarization when the radiated fields cancel each other. A detailed discussion can be found in the publication of Fu et al.19 In an absorbing medium, there is a drop in reflectance for p polarization, but the minimum is not zero. Furthermore, there exists a principal angle at which the phase difference between the two reflection coefficients equals $90^\circ$ and the ratio of the reflectance for the TM and TE waves is minimized.6
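The characteristic angles quoted for Fig. 8.8 follow directly from the critical-angle and Brewster-angle formulas for lossless, nonmagnetic media. The short Python sketch below is illustrative; the function names are assumptions.

```python
import math

def critical_angle(n1, n2):
    """Critical angle (rad) for incidence from the denser medium, n1 > n2."""
    return math.asin(n2 / n1)

def brewster_angle(n1, n2):
    """Brewster angle (rad) for nonmagnetic, lossless media."""
    return math.atan(n2 / n1)

theta_c = math.degrees(critical_angle(2.0, 1.0))       # medium to air
theta_b_in = math.degrees(brewster_angle(1.0, 2.0))    # air to medium
theta_b_out = math.degrees(brewster_angle(2.0, 1.0))   # medium to air

# At the Brewster angle the refracted and reflected waves are perpendicular.
theta_1 = brewster_angle(1.0, 2.0)
theta_2 = math.asin(math.sin(theta_1) / 2.0)           # Snell's law, Eq. (8.64)
angle_sum = math.degrees(theta_1 + theta_2)

print(round(theta_c, 1), round(theta_b_in, 1), round(theta_b_out, 1), round(angle_sum, 1))
```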


The reflectivity for radiation incident from air ($n_1 \approx 1$) or vacuum, at normal incidence, becomes

$$\rho'_{\lambda,n} = \frac{(n_2-1)^2+\kappa_2^2}{(n_2+1)^2+\kappa_2^2} \tag{8.84}$$

for any polarization. It can be seen that the normal reflectivity will be close to 1, when either $n_2 \ll 1$ or $n_2 \gg 1$. The reflectivity is often large for most metals in the infrared because both $n_2$ and $\kappa_2$ are large, whereas the reflectivity of a conventional superconductor approaches 1 when the frequency is lower than that of the superconducting energy gap, since $n_2 \to 0$ in this case. On the other hand, $\rho'_{\lambda,n} \to 0$ when $n_2 \approx 1$ and $\kappa_2 \ll 1$. This can occur in a dielectric material at a wavelength in the infrared and for most metals in the x-ray region.
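The limiting behaviors of Eq. (8.84) are easy to explore numerically. In the minimal sketch below, the function name and the sample optical constants are illustrative assumptions and do not represent any specific material.

```python
def normal_reflectivity(n2, kappa2):
    """Normal-incidence reflectivity from air or vacuum, Eq. (8.84)."""
    return ((n2 - 1.0) ** 2 + kappa2 ** 2) / ((n2 + 1.0) ** 2 + kappa2 ** 2)

rho_dielectric = normal_reflectivity(2.0, 0.0)      # lossless dielectric, n2 = 2
rho_metal_like = normal_reflectivity(10.0, 50.0)    # large n2 and kappa2 (hypothetical)
rho_matched = normal_reflectivity(1.0, 1.0e-3)      # n2 close to 1, kappa2 << 1

print(rho_dielectric, rho_metal_like, rho_matched)
```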

8.3.2 Emissivity

Real materials have finite thicknesses. The assumption of semi-infinity or opaqueness requires that the thickness be much greater than the radiation penetration depth. This is usually not a problem for a metal in the visible or infrared spectral regions. When this is not the case, we are dealing with a transparent or semitransparent material, like a glass window. The radiative properties of semitransparent layers and thin films will be studied in the next chapter. Laser beams or light from a spectrophotometer do not extend to infinity and are not perfectly collimated. Nevertheless, as long as the diameter of the beam spot is much greater than the wavelength and the beam divergence is not very large, the directional-spectral reflectivity and absorptivity, calculated from the previous section, are applicable to most situations. According to Kirchhoff’s law, the directional-spectral emissivity is equal to the directional-spectral absorptivity of a material.1,2 This can be shown by placing the object into an enclosure at thermal equilibrium. When the material is not at thermal equilibrium with the surroundings, the emissivity is defined based solely on the spontaneous emission and is an intrinsic material property that does not depend on the surroundings. On the other hand, the absorptivity is defined based on the net absorbed energy by treating stimulated or induced emission as negative absorption. Under proper definitions, Kirchhoff’s law is always valid in terms of the directional-spectral properties for any given polarization.1 The only assumptions are (a) the material under consideration is at a uniform temperature, at least within several penetration depths near its surface and (b) the external field is not strong enough to alter the material’s intrinsic properties, as in a nonlinear interaction.
We can then compute the directional emissivity for an opaque surface or semi-infinite medium, from the directional-hemispherical reflectivity for incidence from air or vacuum, using the following relation:

$$\varepsilon'_\lambda = 1 - \rho'_\lambda \tag{8.85}$$

The emissivity is commonly calculated by averaging over the two polarizations. The preceding equation can be integrated to obtain the hemispherical emissivity

$$\varepsilon_\lambda = \frac{1}{\pi}\int_0^{2\pi}\int_0^{\pi/2}\varepsilon'_\lambda\cos\theta\,\sin\theta\,d\theta\,d\phi \tag{8.86}$$
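Equation (8.86) can be evaluated with simple numerical quadrature. The sketch below is a minimal illustration that assumes a smooth, lossless dielectric with a real refractive index $n \geq 1$ viewed from air, azimuthal symmetry, and a midpoint rule; the function names and the value $n = 2$ are assumptions.

```python
import math

def unpolarized_reflectivity(n, theta):
    """Polarization-averaged Fresnel reflectivity for incidence from air (real n >= 1)."""
    c1 = math.cos(theta)
    c2 = math.sqrt(1.0 - (math.sin(theta) / n) ** 2)   # cos(theta_2) from Snell's law
    rho_s = ((c1 - n * c2) / (c1 + n * c2)) ** 2       # Eq. (8.69a), squared
    rho_p = ((n * c1 - c2) / (n * c1 + c2)) ** 2       # Eq. (8.78a), squared
    return 0.5 * (rho_s + rho_p)

def hemispherical_emissivity(n, steps=2000):
    """Midpoint-rule evaluation of Eq. (8.86) with azimuthal symmetry."""
    h = 0.5 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        eps_dir = 1.0 - unpolarized_reflectivity(n, theta)   # Eq. (8.85)
        total += eps_dir * math.cos(theta) * math.sin(theta) * h
    return 2.0 * total    # (1/pi) times 2*pi from the azimuthal integral

eps_normal = 1.0 - unpolarized_reflectivity(2.0, 0.0)
eps_hemi = hemispherical_emissivity(2.0)
print(eps_normal, eps_hemi)
```

For this dielectric the hemispherical value falls below the normal value because the directional emissivity drops toward grazing angles.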

It can be seen from Fig. 8.8a that, when averaged over the two polarizations, the reflectivity changes little until the Brewster angle and then increases to 1 when the incidence angle approaches $90^\circ$. The hemispherical emissivity for a nonmetallic surface is about 10% smaller than the normal emissivity. On the other hand, the hemispherical emissivity for


metallic surfaces is about 20% greater than the normal emissivity. Diffuse emission is a good first-order approximation, even though the surface is smooth. Thus, the hemispherical emissivity may be approximated by the normal emissivity. In most studies, the emissivity is calculated from the indirect method, based on the reflectivity and Kirchhoff’s law, discussed earlier. Direct calculations can be accomplished by considering the emission from the material, and the internal absorption and transmission. Another method is based on the fluctuation-dissipation theorem, in which the emission arises from the thermally induced fluctuating currents inside the material. The fluctuational electrodynamics is essential to the study of near-field radiation and will be discussed in detail in Chap. 10. The total-hemispherical emissivity can be evaluated using Planck’s distribution. Therefore,

$$\varepsilon_{\mathrm{tot}} = \frac{\int_0^\infty \varepsilon_\lambda(\lambda)\,e_{b,\lambda}(\lambda,T)\,d\lambda}{\int_0^\infty e_{b,\lambda}(\lambda,T)\,d\lambda} = \frac{\int_0^\infty \varepsilon_\lambda(\lambda)\,e_{b,\lambda}(\lambda,T)\,d\lambda}{\sigma_{SB}T^4} \tag{8.87}$$
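A numerical evaluation of Eq. (8.87) is sketched below using a trapezoidal rule on a logarithmic wavelength grid. The integration limits, the grid size, and the step-function emissivity `selective` are illustrative assumptions; the step function is a hypothetical spectrally selective surface, not data for a real material.

```python
import math

H = 6.62607015e-34         # Planck constant, J s
KB = 1.380649e-23          # Boltzmann constant, J/K
C = 2.99792458e8           # speed of light in vacuum, m/s
SIGMA_SB = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def blackbody_emissive_power(lam, T):
    """Spectral blackbody emissive power e_b,lambda (W m^-2 m^-1)."""
    x = H * C / (lam * KB * T)
    if x > 700.0:          # avoid overflow; the contribution is negligible
        return 0.0
    return 2.0 * math.pi * H * C**2 / lam**5 / math.expm1(x)

def total_emissivity(eps_lambda, T, lam_min=1.0e-7, lam_max=1.0e-3, steps=4000):
    """Eq. (8.87): Planck-weighted average of the spectral emissivity."""
    d = math.log(lam_max / lam_min) / steps

    def g(i):
        lam = lam_min * math.exp(i * d)
        # Extra factor lam because d(lam) = lam * d(ln lam).
        return eps_lambda(lam) * blackbody_emissive_power(lam, T) * lam

    integral = sum(0.5 * (g(i) + g(i + 1)) * d for i in range(steps))
    return integral / (SIGMA_SB * T**4)

def selective(lam):
    """Hypothetical surface: reflective below 3 um, strongly emitting above."""
    return 0.1 if lam < 3.0e-6 else 0.9

eps_gray = total_emissivity(lambda lam: 0.5, 1000.0)
eps_selective_300 = total_emissivity(selective, 300.0)
print(eps_gray, eps_selective_300)
```

The gray case should return the constant spectral emissivity, which also checks that the numerical integral of the Planck function matches $\sigma_{SB}T^4$.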

The total emissivity depends on the surface temperature and the spectral dependence of the optical constants. Pure metals usually have a very low emissivity, and the emissivity increases due to surface oxidation. Spectrally selective materials that appear to be reflective to the visible light may exhibit a large total emissivity, greater than 0.9, at room temperature; examples are white paint and paper. An earlier compilation of the radiative properties of many engineering materials can be found in Touloukian and DeWitt.20 The use of surface microstructure to modify the emission characteristics will be discussed in the next chapter.

8.3.3 Bidirectional Reflectance

Real surfaces contain irregularities or surface roughnesses that depend on the processing method. A surface appears to be smooth if the wavelength is much greater than the surface roughness height. A highly polished surface can have a roughness height on the order of nanometers. Some surfaces that appear “rough” to human eyes may appear to be quite “smooth” for far-infrared radiation. The reflection of radiation by rough surfaces is more complicated. For randomly rough surfaces, there often exist a peak around the direction of specular reflection, an off-specular lobe, and a diffuse component. When the surface contains periodic structures, such as patterned surfaces or micromachined surfaces, diffraction effects may become important and several peaks may appear. The bidirectional reflectance distribution function (BRDF), which is a function of the angles of incidence and reflection, fully describes the reflection characteristics from a rough surface at a given wavelength. As illustrated in Fig. 8.9, the BRDF is defined as the reflected radiance (intensity) divided by the incident irradiance (flux) at the surface, i.e.,

$$f_r(\lambda,\theta_i,\phi_i,\theta_r,\phi_r) = \frac{dI_r}{I_i\cos\theta_i\,d\Omega_i} \quad [\mathrm{sr}^{-1}] \tag{8.88}$$

where $(\theta_i,\phi_i)$ and $(\theta_r,\phi_r)$ denote the directions of incident and reflected beams, respectively, $I_i\cos\theta_i\,d\Omega_i$ is the incident irradiance (radiant flux), and $dI_r$ is the reflected radiance (intensity). In the experiment, the detector output signal is proportional to the solid angle $d\Omega_i$. The denominator of Eq. (8.88) gives the incident radiant power reaching the detector. Hence, the BRDF can be obtained from the following measurement equation:

$$f_r = \frac{P_r}{P_i}\,\frac{1}{\cos\theta_r\,d\Omega_r} \tag{8.89}$$


FIGURE 8.9 Geometry of the incident and reflected beams in defining the BRDF.

where $P_i$ and $P_r$ are the incident and reflected powers reaching the detector.21 The directional-hemispherical reflectance can be obtained by integrating the BRDF over the hemisphere:1,2

$$\rho'_\lambda = \int_{2\pi} f_r\cos\theta_r\,d\Omega_r \tag{8.90}$$

An important principle of the BRDF is reciprocity, which states the symmetry of the BRDF with regard to reflection and incidence angles. In other words, the reflectance for energy incident from $(\theta_i,\phi_i)$ and reflected to $(\theta_r,\phi_r)$ is equal to that for energy incident from $(\theta_r,\phi_r)$ and reflected to $(\theta_i,\phi_i)$. Therefore,

$$f_r(\lambda,\theta_i,\phi_i,\theta_r,\phi_r) = f_r(\lambda,\theta_r,\phi_r,\theta_i,\phi_i) \tag{8.91}$$

For a diffuse or Lambertian surface, the BRDF is independent of $(\theta_r,\phi_r)$ and is related to the directional-hemispherical reflectance as $f_{r,\mathrm{diff}} = \rho'_\lambda/\pi$. On the other hand, the BRDF for an ideal specular, or mirrorlike, reflector can be represented as

$$f_{r,\mathrm{spec}} = \frac{\rho'_\lambda}{\cos\theta_i}\,\delta_\theta(\theta_r-\theta_i)\,\delta_\phi(\phi_r-\phi_i-\pi) \tag{8.92}$$

where the Dirac delta function $\delta(\xi)$ is zero everywhere, except at $\xi = 0$. Furthermore, the delta functions are normalized such that $\int_{2\pi}\delta_\theta(\theta_r-\theta_i)\,\delta_\phi(\phi_r-\phi_i-\pi)\,d\Omega_r = 1$. These examples clearly demonstrate that the BRDF is applicable to any kind of surface. In the next chapter, we will study the BRDF models based on geometric optics and physical optics, as well as rigorous solutions of the Maxwell equations. We will also discuss the effect of surface microstructures on the BRDF and how to characterize a rough surface.
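The integration in Eq. (8.90) can be illustrated with a small quadrature routine that accepts any BRDF written as a function of the incidence and reflection angles. The sketch below is an assumption-laden example: the function names, grid sizes, and the reflectance value 0.3 are hypothetical. It checks that a Lambertian BRDF, $f_{r,\mathrm{diff}} = \rho'_\lambda/\pi$, integrates back to $\rho'_\lambda$.

```python
import math

def directional_hemispherical_reflectance(brdf, theta_i, phi_i, n_theta=400, n_phi=16):
    """Midpoint-rule evaluation of Eq. (8.90) over the reflection hemisphere."""
    d_theta = 0.5 * math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for j in range(n_theta):
        theta_r = (j + 0.5) * d_theta
        # cos(theta_r) * dOmega_r, with dOmega_r = sin(theta_r) dtheta dphi
        weight = math.cos(theta_r) * math.sin(theta_r) * d_theta * d_phi
        for k in range(n_phi):
            phi_r = (k + 0.5) * d_phi
            total += brdf(theta_i, phi_i, theta_r, phi_r) * weight
    return total

RHO = 0.3   # hypothetical directional-hemispherical reflectance

def lambertian_brdf(theta_i, phi_i, theta_r, phi_r):
    """Diffuse BRDF, independent of the reflection direction."""
    return RHO / math.pi

rho_numeric = directional_hemispherical_reflectance(lambertian_brdf, math.radians(30.0), 0.0)
print(rho_numeric)
```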


8.4 DIELECTRIC FUNCTION MODELS

Unlike in dilute gases where the molecules are far apart, in solids, the closely packed atoms form band structures. Absorption in solids usually happens in a much broader frequency region or band. Free electrons in metals can interact with the incoming electromagnetic waves or photons, and cause a broadband absorption from the visible (or even ultraviolet) all the way to the microwave and longer wavelengths. For semiconductors, especially with high impurity (doping) concentrations or at elevated temperatures, both the free electrons and holes contribute to the absorption process. The absorption of a photon causes the electron or the hole to transit to a higher-energy state within the same band. Therefore, free-carrier absorption is caused by intraband transitions. In order to conserve momentum, the carriers must also collide with ionized impurities, phonons, other carriers, grain boundaries, interfaces, and so forth. The collisions act as a damping force on the motion of carriers. The Drude model describes the oscillatory movement of an electron, driven by a harmonic field, which is subjected to a damping force. The model is simple in form and predicts the dielectric function of some metals fairly well in a broad spectral region, especially in the mid- and far-infrared.

Absorption by lattice vibrations or bound electrons, which is important for insulators and lightly doped semiconductors, is due to the existence of electric dipoles formed by the lattice. A maximum absorption is achieved when the frequency equals that of the vibrational mode of the dipole, i.e., the resonance frequency, which is usually in the mid- to far-infrared region of the spectrum. The contribution of bound electrons is often modeled by the Lorentz model.

Interband transition is the fundamental absorption process in semiconductors. An electron can be excited from the valence band to the conduction band by absorbing a photon whose energy is greater than the energy gap $E_g$. Because the absorption by electrons is usually weak in semiconductors, a strong absorption edge is formed near the bandgap. In this transition process, both the energy and the momentum must be conserved.

This section discusses the formulation for different contributions to the dielectric function. It should be noted that the real and imaginary parts of the dielectric function are interrelated according to causality, which is discussed first. Because all naturally occurring and most of the synthesized materials are nonmagnetic at high frequencies, only nonmagnetic materials are considered so that $\mu = 1$ and $n = \sqrt{\varepsilon}$ in the following, except in Sec. 8.4.6.

8.4.1 Kramers-Kronig Dispersion Relations

The real and imaginary parts of an analytic function are related by the Hilbert transform relations. Hendrik Kramers and Ralph Kronig were the first to show that the real and imaginary parts of the dielectric function are interrelated. These relations are called the Kramers-Kronig dispersion relations, or K-K relations for simplicity. The K-K relations can be interpreted as causality in the frequency domain and are very useful in obtaining optical constants from limited measurements. The principle of causality states that the effect cannot precede the cause, or no output before input. Some important relations are given here; detailed derivations and proofs can be found in Jackson [5], Born and Wolf [6], and Bohren and Huffman [7]. The real part $\varepsilon'$ and the imaginary part $\varepsilon''$ of a dielectric function are related by

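$$\varepsilon'(\omega) - 1 = \frac{2}{\pi}\,\wp\!\int_0^\infty \frac{\zeta\,\varepsilon''(\zeta)}{\zeta^2 - \omega^2}\, d\zeta \tag{8.93a}$$

and

$$\varepsilon''(\omega) - \frac{\sigma_0}{\varepsilon_0\,\omega} = -\frac{2\omega}{\pi}\,\wp\!\int_0^\infty \frac{\varepsilon'(\zeta) - 1}{\zeta^2 - \omega^2}\, d\zeta \tag{8.93b}$$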


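where $\sigma_0$ is the dc conductivity, $\wp$ denotes the principal value of the integral, and $\zeta$ is a dummy frequency variable. These relations can be written in terms of $n$ and $\kappa$ as

$$n(\omega) - 1 = \frac{2}{\pi}\,\wp\!\int_0^\infty \frac{\zeta\,\kappa(\zeta)}{\zeta^2 - \omega^2}\, d\zeta \tag{8.94a}$$

$$\kappa(\omega) = -\frac{2\omega}{\pi}\,\wp\!\int_0^\infty \frac{n(\zeta) - 1}{\zeta^2 - \omega^2}\, d\zeta \tag{8.94b}$$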
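A minimal numerical check of Eq. (8.93a) can make the principal-value integral more concrete. The sketch below assumes a single Lorentz oscillator with arbitrary parameters as a test function (it is not a material model from this section) and evaluates the integral on a midpoint grid that straddles the pole symmetrically, so that the singular contributions cancel in pairs. The function names, the grid spacing, and the truncation frequency are illustrative choices.

```python
import math


def lorentz_eps(omega, omega_p=1.0, omega_0=1.0, gamma=0.1):
    """Hypothetical single-oscillator dielectric function (exp(-i omega t) convention),
    used here only as a causal test function with a known real part."""
    return 1.0 + omega_p**2 / complex(omega_0**2 - omega**2, -gamma * omega)


def kk_real_part(eps_imag, k, h=1e-3, zeta_max=50.0):
    """Estimate eps'(omega) at omega = k*h from Eq. (8.93a).

    The integration nodes zeta_j = (j + 1/2) h never coincide with omega = k h,
    and they are placed symmetrically about it, which approximates the
    principal value. The integral is truncated at zeta_max.
    """
    omega = k * h
    n_nodes = int(zeta_max / h)
    total = 0.0
    for j in range(n_nodes):
        zeta = (j + 0.5) * h
        total += zeta * eps_imag(zeta) / (zeta**2 - omega**2)
    return 1.0 + (2.0 / math.pi) * total * h


k = 500  # omega = 0.5 with the default spacing h = 1e-3
eps_kk = kk_real_part(lambda z: lorentz_eps(z).imag, k)
eps_exact = lorentz_eps(k * 1e-3).real
print(f"K-K estimate {eps_kk:.4f}, closed form {eps_exact:.4f}")
```

With measured data the imaginary part is known only over a finite range, so the truncation and any extrapolation beyond the measured band become the main sources of error; the closed-form test function above avoids that issue by construction.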

Equation (8.93) and Eq. (8.94) are the K-K relations, which relate the real part of a causal function to an integral of its imaginary part over all frequencies, and vice versa. A number of sum rules can be derived based on the K-K relations and are useful in obtaining or validating the dielectric function of a given material. The K-K relations can be applied to reflectance spectroscopy to facilitate the determination of optical constants from the measured reflectivity of a material in vacuum [9]. For radiation incident from vacuum on a material at normal incidence, the Fresnel reflection coefficient is

$$r(\omega) = |r(\omega)|\,e^{i\phi(\omega)} = \frac{1 - n(\omega) - i\kappa(\omega)}{1 + n(\omega) + i\kappa(\omega)} \tag{8.95}$$

where $|r|$ is the amplitude and $\phi$ the phase of the reflection coefficient. The directional-hemispherical spectral reflectivity, expressed in terms of $\omega$, is

$$\rho'_\omega(\omega) = r r^* = |r|^2 \tag{8.96}$$

The amplitude and the phase are related, and it can be shown that

$$\phi(\omega) = -\frac{\omega}{\pi}\,\wp\!\int_0^\infty \frac{\ln \rho'_\omega(\zeta)}{\zeta^2 - \omega^2}\, d\zeta \tag{8.97}$$

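The refractive index and the extinction coefficient can be calculated, respectively, from

$$n(\omega) = \frac{1 - \rho'_\omega}{1 + \rho'_\omega + 2\cos\phi\,\sqrt{\rho'_\omega}} \tag{8.98}$$

and

$$\kappa(\omega) = \frac{-2\sin\phi\,\sqrt{\rho'_\omega}}{1 + \rho'_\omega + 2\cos\phi\,\sqrt{\rho'_\omega}} \tag{8.99}$$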
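Equations (8.98) and (8.99) follow from inverting Eq. (8.95), and a round trip is an easy way to check the signs. The sketch below, with function names and the sample optical constants chosen only for illustration, computes $r$ from an assumed $(n, \kappa)$, extracts $\rho'_\omega$ and $\phi$, and recovers the optical constants.

```python
import cmath
import math


def fresnel_normal(n, kappa):
    """Eq. (8.95): reflection coefficient for normal incidence from vacuum."""
    big_n = complex(n, kappa)
    return (1 - big_n) / (1 + big_n)


def optical_constants(rho, phi):
    """Eqs. (8.98) and (8.99): (n, kappa) from reflectivity rho and phase phi."""
    denom = 1 + rho + 2 * math.cos(phi) * math.sqrt(rho)
    n = (1 - rho) / denom
    kappa = -2 * math.sin(phi) * math.sqrt(rho) / denom
    return n, kappa


# Hypothetical optical constants, used only to exercise the round trip.
n_in, kappa_in = 3.42, 0.05
r = fresnel_normal(n_in, kappa_in)
rho, phi = abs(r) ** 2, cmath.phase(r)
n_out, kappa_out = optical_constants(rho, phi)
print(n_out, kappa_out)
```

In a reflectance-spectroscopy workflow the phase would come from Eq. (8.97) rather than from a known $r$; here it is taken directly from the complex coefficient so that only the inversion step is tested.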

8.4.2 The Drude Model for Free Carriers

The Drude model describes the frequency-dependent conductivity of metals and can be extended to free carriers in semiconductors. In the absence of an electromagnetic field, free electrons move randomly. When an electromagnetic field is applied, free electrons acquire a nonzero average velocity, giving rise to an electric current that oscillates at the same frequency as the electromagnetic field. The collisions with the stationary atoms result in a damping force on the free electrons, which is proportional to their velocity. The equation of motion for a single free electron is then

$$m_e\ddot{x} = -m_e\gamma\dot{x} - eE \tag{8.100}$$


where $e$ is the absolute charge of an electron, $m_e$ is the electron mass, and $\gamma$ denotes the strength of the damping due to collision, i.e., the scattering rate or the inverse of the relaxation time $\tau$. Assume the electron motion under a harmonic field $E = E_0 e^{-i\omega t}$ is of the form $x = x_0 e^{-i\omega t}$ so that $\ddot{x} = -i\omega\dot{x}$. We can rewrite Eq. (8.100) as

$$\dot{x} = \frac{e/m_e}{i\omega - \gamma}\,E$$

The electric current density is $J = -n_e e\dot{x} = \tilde{\sigma}(\omega)E$; therefore, the complex conductivity is

$$\tilde{\sigma}(\omega) = \frac{n_e e^2/m_e}{\gamma - i\omega} = \frac{\sigma_0}{1 - i\omega/\gamma} \tag{8.101}$$

where $\sigma_0 = n_e e^2\tau/m_e$ is the dc conductivity, as discussed in Chap. 5. Equation (8.101) is called the Drude free-electron model, which describes the frequency-dependent complex conductivity of a free-electron system in terms of the dc conductivity and the scattering rate, in a rather simple form. The electrical conductivity approaches the dc conductivity at very low frequencies (or very long wavelengths). The dielectric function is related to the conductivity by Eq. (8.28); thus,

$$\varepsilon(\omega) = \varepsilon_\infty - \frac{\sigma_0\gamma}{\varepsilon_0(\omega^2 + i\gamma\omega)} \tag{8.102}$$
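Equations (8.101) and (8.102) are simple enough to evaluate directly. The following sketch does so and checks two things: that Eq. (8.102) agrees with $\varepsilon_\infty + i\tilde{\sigma}/(\varepsilon_0\omega)$, which is the form of the conductivity relation assumed here, and that the conductivity approaches $\sigma_0$ when $\omega \ll \gamma$. The values of `SIGMA0` and `GAMMA` are hypothetical, order-of-magnitude numbers for a good metal and are not taken from the text.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def drude_conductivity(omega, sigma0, gamma):
    """Eq. (8.101): complex conductivity sigma0 / (1 - i omega / gamma)."""
    return sigma0 / complex(1.0, -omega / gamma)


def drude_epsilon(omega, sigma0, gamma, eps_inf=1.0):
    """Eq. (8.102): Drude dielectric function."""
    return eps_inf - sigma0 * gamma / (EPS0 * complex(omega**2, gamma * omega))


SIGMA0 = 4.0e7   # S/m, hypothetical dc conductivity
GAMMA = 4.0e13   # rad/s, hypothetical scattering rate

# Low-frequency limit: omega << gamma
omega_low = 1.0e-6 * GAMMA
sigma_low = drude_conductivity(omega_low, SIGMA0, GAMMA)
eps_low = drude_epsilon(omega_low, SIGMA0, GAMMA)

# Consistency between Eq. (8.102) and the conductivity form, at an infrared frequency
omega_ir = 1.0e14
eps_a = drude_epsilon(omega_ir, SIGMA0, GAMMA)
eps_b = 1.0 + 1j * drude_conductivity(omega_ir, SIGMA0, GAMMA) / (EPS0 * omega_ir)

print(sigma_low, eps_low)
print(eps_a, eps_b)
```

For these assumed parameters the imaginary part of the dielectric function dominates the real part by several orders of magnitude at low frequency, which is the regime the Drude model is usually applied to in the far-infrared.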

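Here, $\varepsilon_\infty$, which is on the order of 1, is included to account for contributions, other than the contribution of the free electrons, that are significant at high frequencies. There exist several transitions in the ultraviolet and visible regions for metals, such as interband transitions. Note that when $\omega \to \infty$, the real part of the dielectric function of all materials should approach unity, as can be seen from Eq. (8.93a). In the low-frequency limit when $\omega \ll \gamma$, $\tilde{\sigma}(\omega \to 0) \approx \sigma_0$ and $\varepsilon'' \gg \varepsilon'$. Therefore, ...

$$\rho_{ss} = \frac{\left|(\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_s + (\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82a}$$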

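$$\rho_{sp} = \frac{\left|(\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_s - (\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82b}$$

$$\rho_{ps} = \frac{\left|(\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_s - (\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82c}$$

$$\rho_{pp} = \frac{\left|(\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_s + (\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82d}$$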

where $r$ denotes Fresnel's reflection coefficient. The subscripts $s$ and $p$ stand for each polarization. On the left-hand side, the double subscripts indicate the polarization for the incidence and the reflection, respectively. In terms of the microfacet reflectivities, the reflected energies $G_{r,s}$ and $G_{r,p}$ are related to the incident energies $G_{i,s}$ and $G_{i,p}$ by

$$\begin{bmatrix} G_{r,s} \\ G_{r,p} \end{bmatrix} = \begin{bmatrix} \rho_{ss} & \rho_{ps} \\ \rho_{sp} & \rho_{pp} \end{bmatrix} \begin{bmatrix} G_{i,s} \\ G_{i,p} \end{bmatrix} \tag{9.83}$$

The reflectivity is defined as the ratio of the reflected energy $G_r = G_{r,s} + G_{r,p}$ to the incident energy $G_i = G_{i,s} + G_{i,p}$; thus, it depends on the polarization state of the incident wave. To facilitate the calculation, the incident energy of each ray bundle is set to unity such that $(G_{i,s}, G_{i,p}) = (1, 0)$ for s polarization, $(G_{i,s}, G_{i,p}) = (0, 1)$ for p polarization, and $(G_{i,s}, G_{i,p}) = (0.5, 0.5)$ for random polarization (i.e., unpolarized incidence). For the first reflection, $G_{r,s}$ and $G_{r,p}$ are calculated from Eq. (9.83). For multiple reflections, the previously reflected energies are substituted for $G_{i,s}$ and $G_{i,p}$, and the next reflected energy is updated according to Eq. (9.83). Each ray bundle is traced until it leaves the surface, and then the information of its direction and energy for each polarization is stored in a database. Because the energy of the bundle is reduced after each reflection, there is no need to use random numbers to decide whether a ray bundle is reflected at the microfacet or not.

In a special case, when the planes of incidence and reflection are identical, the polarization state is maintained for either s or p polarization if only the first-order scattering has been considered. This means that the vectors $\mathbf{h}_i$ and $\mathbf{h}_r$ are either parallel or antiparallel (refer to Fig. 9.20); consequently, $\mathbf{h}_i\cdot\mathbf{s}_r = 0$ and $\mathbf{h}_r\cdot\mathbf{s}_i = 0$. It can be seen from Eq. (9.82) that $\rho_{sp} = \rho_{ps} = 0$, $\rho_{ss} = |r_s|^2$, and $\rho_{pp} = |r_p|^2$. The corresponding BRDF is called the in-plane BRDF ($\phi_r = \phi_i$ or $\phi_r = \phi_i + 180°$). Nevertheless, the cross-polarization term is nonzero for the in-plane BRDF when multiple scattering is significant. After a large number of ray bundles have been traced, the BRDF can be calculated in terms of the energy of the ray bundles:

$$f_r(\lambda,\theta_i,\phi_i,\theta_r,\phi_r) = \frac{1}{G_i(\theta_i,\phi_i)}\,\frac{G_r(\theta_r,\phi_r)}{\cos\theta_r\,\Delta\Omega_r} \tag{9.84}$$

where $G_i(\theta_i,\phi_i)$ is the total energy of the incident ray bundles, and $G_r(\theta_r,\phi_r)$ is the energy of the ray bundles leaving the surface within the solid angle $\Delta\Omega_r$, in the direction $(\theta_r,\phi_r)$. The integration of the BRDF yields the directional-hemispherical reflectance. The directional emittance can be obtained according to the conservation of energy and Kirchhoff's law.
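The energy bookkeeping of Eqs. (9.83) and (9.84) can be sketched in a few lines. In the example below the microfacet reflectivities are supplied as plain numbers rather than computed from Eq. (9.82), and all names and values are hypothetical; the point is only to show how the s and p energies of one ray bundle are updated at each reflection and how the accumulated energy is turned into a BRDF estimate.

```python
import math


def reflect(g_in, rho):
    """One application of Eq. (9.83).

    g_in is (G_s, G_p). rho maps 'ss', 'sp', 'ps', 'pp' to microfacet
    reflectivities, where the first letter is the incident polarization
    and the second the reflected polarization.
    """
    g_s, g_p = g_in
    return (rho["ss"] * g_s + rho["ps"] * g_p,
            rho["sp"] * g_s + rho["pp"] * g_p)


def trace_bundle(g_in, facets):
    """Apply Eq. (9.83) successively for each microfacet the bundle hits."""
    g = g_in
    for rho in facets:
        g = reflect(g, rho)
    return g


def brdf_estimate(g_r, g_i, theta_r, d_omega_r):
    """Eq. (9.84): BRDF from the bundle energy collected in one solid-angle bin."""
    return (g_r / g_i) / (math.cos(theta_r) * d_omega_r)


# Hypothetical reflectivities: an in-plane first hit (no cross-polarization),
# followed by a second hit that mixes the polarizations.
first = {"ss": 0.40, "sp": 0.0, "ps": 0.0, "pp": 0.30}
second = {"ss": 0.35, "sp": 0.05, "ps": 0.04, "pp": 0.30}

unpolarized = (0.5, 0.5)
after_one = trace_bundle(unpolarized, [first])
after_two = trace_bundle(unpolarized, [first, second])
print(after_one, after_two)
```

Because the bundle carries a continuously decreasing energy instead of being absorbed or reflected at random, every traced bundle contributes to the statistics, which is the design choice described in the paragraph above.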

9.5.3 Surface Characterization

In most studies, surface roughness is assumed to satisfy Gaussian statistics in the derivation of the BRDF model and for the surface generation in the Monte Carlo simulation. Furthermore, the roughness statistics of 2-D rough surfaces are assumed to be isotropic in most publications so that the autocorrelation function is independent of the direction. However, the Gaussian distribution may miss important features of natural and man-made rough surfaces that are strongly anisotropic.

Before the invention of the AFM, the surface profile was usually measured with a mechanical profiler that scans the surface line by line. Some mechanical stylus profilers can measure rough surfaces with a vertical resolution of a few nanometers. However, the lateral resolution is usually on the order of 1 μm due to the large radius of the stylus probe. Because the radius of curvature of the probe tip is in the range from 5 to 50 nm, an AFM can provide detailed information on the topography of a small area on microrough surfaces, with a subnanometer vertical resolution and a lateral resolution around 10 nm. The result is stored in an array containing the height information, $z(m, n)$, where $m = 1, 2, \ldots, M$ and $n = 1, 2, \ldots, N$ are the points along the $x$ and $y$ directions, respectively. To evaluate the 2-D slope distribution $p(\zeta_x, \zeta_y)$, each surface element is determined by the four closest nodes in the data array. The four-node element can be considered as two triangular surfaces with a common side. The surface normals for the two triangles can be averaged to give the mean slope of the surface element such that

$$\zeta_x = \frac{z_{m+1,n} - z_{m,n}}{2l} + \frac{z_{m+1,n+1} - z_{m,n+1}}{2l} \tag{9.85a}$$

$$\zeta_y = \frac{z_{m,n+1} - z_{m,n}}{2l} + \frac{z_{m+1,n+1} - z_{m+1,n}}{2l} \tag{9.85b}$$

where $l$ is the lateral distance between adjacent data points [26]. The SDF can be determined by evaluating the slopes of all measured surface elements. For a scan area of 100 × 100 μm², the lateral interval is $l \approx 0.2$ μm when the data are stored in a 512 × 512 array.

The 2-D SDFs from the AFM measurement in the tapping mode, for two lightly doped ⟨100⟩ single-crystal silicon surfaces, are shown in Fig. 9.21 [27]. In the contact mode, lateral or shear forces can distort surface features and reduce the spatial resolution. Thus, deep valleys may not be correctly measured. The AFM scanning performed in the tapping mode with sharper silicon tips allows measuring precipitous slopes. The two SDFs are non-Gaussian and anisotropic, although the anisotropy of Si-1 is not as striking as that of Si-2. The SDF of Si-1 contains only one dominant peak at the center, indicating that a large number of microfacets are only slightly tilted. The SDF of Si-2 also has a dominant peak at the center, though smaller than that of Si-1. Four side peaks can also be seen that are nearly symmetric. These side peaks are associated with the formation of {311} planes during the chemical etching of the (100) crystalline wafer [26,27]. The angle between the (100) plane and any of the four (311) planes is $\cos^{-1}(3/\sqrt{11}) = 25.2°$, which is close to the location of the observed side peaks.

FIGURE 9.21 2-D slope distribution obtained from AFM topographic measurements for two samples: (a) Si-1; (b) Si-2.
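The slope evaluation of Eq. (9.85) maps directly onto a nested loop over the height array. The sketch below uses a small synthetic tilted plane in place of AFM data, so the recovered slopes are known in advance; the array size, the spacing, and the function name are assumptions for illustration only.

```python
def element_slopes(z, l):
    """Mean slopes of each four-node surface element, Eqs. (9.85a) and (9.85b).

    z[m][n] is the height at node (m, n), with m along x and n along y;
    l is the lateral spacing between adjacent nodes (same length unit as z).
    Returns two (M-1) x (N-1) nested lists: slopes along x and along y.
    """
    n_x, n_y = len(z), len(z[0])
    slope_x = [[0.0] * (n_y - 1) for _ in range(n_x - 1)]
    slope_y = [[0.0] * (n_y - 1) for _ in range(n_x - 1)]
    for m in range(n_x - 1):
        for n in range(n_y - 1):
            slope_x[m][n] = ((z[m + 1][n] - z[m][n]) / (2 * l)
                             + (z[m + 1][n + 1] - z[m][n + 1]) / (2 * l))
            slope_y[m][n] = ((z[m][n + 1] - z[m][n]) / (2 * l)
                             + (z[m + 1][n + 1] - z[m + 1][n]) / (2 * l))
    return slope_x, slope_y


# Synthetic test surface: a plane z = a*x + b*y sampled on a 4 x 5 grid.
spacing = 0.2          # lateral interval, in the same (arbitrary) unit as z
a, b = 0.38, -0.10     # hypothetical slopes of the test plane
heights = [[a * m * spacing + b * n * spacing for n in range(5)] for m in range(4)]

slope_x, slope_y = element_slopes(heights, spacing)
print(slope_x[0][0], slope_y[0][0])
```

A slope distribution function could then be estimated by binning the `(slope_x, slope_y)` pairs of all elements into a 2-D histogram and normalizing it, which is one plausible reading of "evaluating the slopes of all measured surface elements".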

9.5.4 BRDF Measurements

The BRDF of silicon wafers was measured with a laser scatterometer, called the three-axis automated scatterometer (TAAS), shown schematically in Fig. 9.22 [30]. The sample is vertically mounted. Three rotary stages, automatically controlled by a computer, are used to change incidence and reflection directions. One rotates the sample around the y-axis to change the incidence angle $\theta_i$, another rotates detector A in the x-z (horizontal) plane to change the reflection angle $\theta_r$, and the third rotates the arm of detector A out of the x-z plane to change the azimuthal angle $\phi_r$ for out-of-plane measurements. Manual rotation of the sample on a sample holder around the z-axis adjusts the azimuthal angle $\phi_i$. The incident laser beam is parallel to the optical table (x-z plane).

A diode laser system serves as an optical source, and a lock-in amplifier, connected with a diode laser controller, modulates the output optical power at 400 Hz. The wavelength can be selected by replacing the fiber-coupled diode laser, and a number of diode lasers in the visible and the near-infrared are available. The diode laser is mounted on a thermoelectrically controlled stage to provide power stability within a standard deviation of 0.2%. An optical fiber is used to provide flexibility for optical access and alignment. The light from the output end of the fiber is in the horizontal plane. As shown in Fig. 9.22, the beam first passes through a collimator with a pair of lenses and a small aperture. A linear polarizer mounted on a dial allows the selection of polarization for light incident on the sample. The beamsplitter then divides the laser beam into two paths: one goes to the sample and the other to a stationary reference detector B. The light scattered by the sample is measured by detector A. The beam spot size on the sample is a few millimeters in diameter, and the measurement can be considered as a spatial average over the beam diameter.

Si and Ge photodiode detectors measure the radiant power in the wavelength range from 350 to 1100 nm and from 800 to 1800 nm, respectively. The power collected at each detector is sent to a trans-impedance preamplifier that has nine decades of amplification range. The preamplifier has a linear frequency response from dc (zero frequency) up to a certain maximum frequency that is much greater than 400 Hz. The lock-in amplifier picks up only the phase-locked signals at 400 Hz, thereby eliminating the effect of background radiation or stray light without using a chopper. The measurement equation for the BRDF is given by

$$f_r(\theta_i,\phi_i,\theta_r,\phi_r) = C_I\,\frac{V_A}{V_B\cos\theta_r\,\Delta\Omega_r} \tag{9.86}$$

where $V_A$ and $V_B$ are the outputs of detectors A and B, respectively, and $\Delta\Omega_r$ is the reflection solid angle, which is $1.84 \times 10^{-4}$ sr, as determined by the area of a precision-machined aperture in front of the detector and the distance between this aperture and the beam spot on the sample. An instrument constant $C_I$ compensates for the beamsplitter ratio and the difference in the responsivities of the two detectors. The BRDF within 2.5° of the retroreflection direction ($\theta_r = \theta_i$ and $\phi_r = \phi_i$) cannot be measured since the movable detector blocks the incident beam. A PC performs the data acquisition and automatic rotary-stage control in a LabVIEW environment. In the measurements, $V_A$ and $V_B$ are averaged over many measurements at a given position to reduce the random error. The relative uncertainty of the TAAS is estimated to be 5% for $f_r > 0.1$ through intercomparison with a reference standard instrument at NIST [30].

FIGURE 9.22 Schematic of the three-axis automated scatterometer (TAAS) for BRDF measurements.
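The measurement equation (9.86) can be written as a short helper. In the sketch below only the solid angle of $1.84 \times 10^{-4}$ sr comes from the text; the detector voltages, the instrument constant, and the aperture dimensions passed to `solid_angle` are invented for illustration, and the small-angle estimate $\Delta\Omega \approx A/d^2$ is an assumption about how the solid angle would be obtained from the aperture area and distance.

```python
import math

D_OMEGA_R = 1.84e-4  # reflection solid angle quoted in the text, sr


def solid_angle(aperture_diameter, distance):
    """Small-angle estimate: aperture area divided by distance squared (same length units)."""
    area = math.pi * (aperture_diameter / 2) ** 2
    return area / distance ** 2


def mean(values):
    """Average repeated detector readings to reduce random error."""
    return sum(values) / len(values)


def brdf_taas(v_a, v_b, theta_r_deg, c_i, d_omega_r=D_OMEGA_R):
    """Eq. (9.86): BRDF from the signal (A) and reference (B) detector outputs."""
    return c_i * v_a / (v_b * math.cos(math.radians(theta_r_deg)) * d_omega_r)


# Hypothetical readings (arbitrary voltage units) and instrument constant.
v_a_readings = [1.9e-5, 2.0e-5, 2.1e-5]
v_b_readings = [1.00, 1.01, 0.99]
f_r = brdf_taas(mean(v_a_readings), mean(v_b_readings), theta_r_deg=30.0, c_i=1.0)
print(f"f_r = {f_r:.4f} sr^-1")

# Illustrative geometry only: a 10-mm aperture at 0.5 m from the beam spot.
print(f"solid angle = {solid_angle(0.010, 0.5):.3e} sr")
```

Taking the ratio $V_A/V_B$ makes the result insensitive to slow drifts of the laser power, since both detectors see the same fluctuation.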


9.5.5 Comparison of Modeling with Measurements

Figure 9.23 compares the predicted BRDFs based on the slope distribution with the BRDFs measured using TAAS at $\lambda = 635$ nm, for Si-2, which is strongly anisotropic [27]. For clarity, only the prediction using the MSM is presented. The predictions with the SGM and the analytical model yield a similar agreement with experiments [26,27]. As can be seen from Fig. 9.23a, the prediction and the measurement agree well, except near $\theta_{obs} = 0°$, where the measurements cannot be taken within 2.5° and the simulation has a large fluctuation. The simulation captures the general features and trends of the measured BRDF, while some discrepancies exist near the side peaks. For $\theta_i = 0°$ and $\phi_i = 45°$, as shown in Fig. 9.23b, the BRDF contains two large side peaks associated with the side peaks in the SDF for Si-2 at $|\zeta_x| \approx |\zeta_y| \approx 0.38$ in Fig. 9.21b. The Monte Carlo simulations also predict the side peaks located approximately at $\theta_r = 57°$, which deviates somewhat from the measured value of 50°. Based on Snell's law, the inclination angle of microfacets is half of $\theta_r$ at $\theta_i = 0°$. Therefore, the measured side peaks in the BRDF correspond to an inclination angle of 25°, which is very close to the angle of 25.2° between any of the four {311} planes and the (100) plane. On the other hand, the predicted side peaks correspond to an inclination angle of 28.5°, which is almost the same as that calculated from the slope at $|\zeta_x| = |\zeta_y| = 0.38$. Consequently, the side peak position obtained from the BRDF measurement is more reliable than that predicted by the Monte Carlo methods using the topographic data from the AFM measurement. Due to the artifacts in the AFM measurements, the BRDF values are underpredicted when $15° < \theta_r < 50°$ and overpredicted when $50° < \theta_r < 80°$. When $\theta_i = 45°$, the Monte Carlo method overpredicts the specular peak, presumably due to the limitation of geometric optics. The disagreement between the predicted and measured BRDFs, for $60° < \theta_{obs} < 85°$, may be due to the combined result of the artifacts in the AFM measurement, the limitation of the GOA, and multiple scattering. For $\theta_i = 45°$ and $\phi_i = 45°$, a small side peak appears at $\theta_{obs} = 60°$ in the measured curve and at $\theta_{obs} = 71°$ in the predicted curve. This is believed to be due to microfacets with {111} orientation that have an inclination angle of 54.7°. The small side peak should occur around $\theta_{obs} = 64.4°$ based on simple geometric arguments.

FIGURE 9.23 Comparison of Monte Carlo model based on the MSM and the measured in-plane BRDF for Si-2: (a) $\theta_i = 0°$, $\phi_i = 0°$; (b) $\theta_i = 0°$, $\phi_i = 45°$; (c) $\theta_i = 45°$, $\phi_i = 0°$; (d) $\theta_i = 45°$, $\phi_i = 45°$. Each panel plots $f_r\cos\theta_r$ (sr⁻¹) against the observation angle $\theta_{obs}$ (deg) for the Monte Carlo (MSM) prediction and the TAAS data. The observation angle $\theta_{obs}$ is the same as the reflection polar angle when $\phi_r = \phi_i + \pi$ and the negative of the reflection polar angle when $\phi_r = \phi_i$ [27].

Figure 9.24 shows the directional-spectral emittance measured using an integrating sphere coupled with a monochromator [31]. The directional emittance was calculated from the measured directional-hemispherical reflectance at an incidence angle of approximately 7°. The emittance values calculated from the models based on Gaussian distribution and anisotropic slope distribution are compared with those obtained from experiments. For Si-1, which is nearly isotropic, the difference between the models is small and the agreement with the experiment is excellent. The combined uncertainty in the measurement is estimated to be 0.01, except at $\lambda = 1000$ nm, where the silicon wafer becomes slightly transparent. For Si-2, however, the Gaussian model underpredicts the emittance and there is a large enhancement of the emittance due to anisotropy. The Monte Carlo model, based on the MSM, significantly improves the prediction. Given the fact that the AFM surface topographic measurements may not perfectly match the actual surface slope distribution, an uncertainty of 0.01 has been estimated for the Monte Carlo model. It can be seen that the prediction agrees with the measurement better at short wavelengths, where geometric optics is more suitable.

FIGURE 9.24 Comparison of the predicted and measured emittance of (a) Si-1 and (b) Si-2, at a polar angle approximately equal to 7° [31]. Each panel plots the emittance $\varepsilon'_\lambda$ against the wavelength $\lambda$ (nm) from 400 to 1000 nm for the anisotropic model, the Gaussian model, and the experiment.

The out-of-plane BRDFs of Si-1 and Si-2, calculated with the MSM at $\theta_i = 30°$, are presented in Fig. 9.25 as contour plots in a polar coordinate system [27]. In these plots, the radial and azimuthal coordinates respectively correspond to $\theta_r$ and $\phi_r$, and the z-axis represents $f_r\cos\theta_r$. The BRDFs depend little on $\phi_r$ around the specular direction, but the dependence becomes large as the angular separation from the specular peak increases. The region where the BRDF is independent of $\phi_r$ is broader for Si-1 than for Si-2. The predicted BRDFs for Si-2 display a strong specular reflection peak, together with the four large side peaks associated with {311} planes. In addition, a small side peak associated with a {111} plane appears at large $\theta_r$, as illustrated in Fig. 9.25c at $\phi_r = 294°$ and another in Fig. 9.25d at $\phi_r = 45°$. The actual magnitudes of the small side peaks may be smaller than those predicted by the MSM, and their positions may shift toward smaller $\theta_r$. Nevertheless, Fig. 9.25 indicates that the Monte Carlo method is an effective technique for studying the BRDFs for anisotropic surfaces.

FIGURE 9.25 BRDF predicted by the MSM at $\theta_i = 30°$ for random polarization [27]. (a) Si-1 at $\phi_i = 0°$. (b) Si-1 at $\phi_i = 45°$. (c) Si-2 at $\phi_i = 0°$. (d) Si-2 at $\phi_i = 45°$. In the polar contour plots, the radial coordinate corresponds to $\theta_r$, and the azimuthal coordinate corresponds to $\phi_r$.
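The geometric arguments used in this section for the side-peak positions can be reproduced with a few lines of arithmetic. The sketch below assumes cubic crystal planes, mirror-like facets whose normals lie in the plane of incidence, and a facet tilted toward the source by more than the incidence angle, so that the reflected beam leaves at a polar angle of $2\alpha - \theta_i$; the function names are illustrative, and the sign convention of the observation angle is deliberately left out.

```python
import math


def angle_between_planes(hkl_a, hkl_b):
    """Angle (deg) between two planes of a cubic crystal, from their Miller indices."""
    dot = sum(a * b for a, b in zip(hkl_a, hkl_b))
    norm_a = math.sqrt(sum(a * a for a in hkl_a))
    norm_b = math.sqrt(sum(b * b for b in hkl_b))
    return math.degrees(math.acos(dot / (norm_a * norm_b)))


def inclination_from_slopes(slope_x, slope_y):
    """Facet inclination (deg) from its slopes along x and y."""
    return math.degrees(math.atan(math.hypot(slope_x, slope_y)))


def side_peak_polar_angle(inclination_deg, theta_i_deg):
    """Polar angle (deg) of the beam reflected by a facet tilted toward the source.

    Assumes specular reflection at the facet and a facet normal lying in the
    plane of incidence; at normal incidence this reduces to twice the inclination.
    """
    return 2.0 * inclination_deg - theta_i_deg


alpha_311 = angle_between_planes((1, 0, 0), (3, 1, 1))   # about 25.2 deg
alpha_111 = angle_between_planes((1, 0, 0), (1, 1, 1))   # about 54.7 deg
alpha_afm = inclination_from_slopes(0.38, 0.38)          # from the SDF side peaks

print(f"{{311}} facets: inclination {alpha_311:.1f} deg, "
      f"side peak at {side_peak_polar_angle(alpha_311, 0.0):.1f} deg for normal incidence")
print(f"{{111}} facets: inclination {alpha_111:.1f} deg, "
      f"side peak at {side_peak_polar_angle(alpha_111, 45.0):.1f} deg for 45 deg incidence")
print(f"AFM slope of 0.38 in x and y: inclination {alpha_afm:.1f} deg")
```

Under these assumptions the {311} facets put the normal-incidence side peak near the measured 50°, whereas the inclination implied by the AFM slopes is a few degrees larger, consistent with the discussion of the AFM artifacts above.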

9.6 SUMMARY

This chapter provided a detailed treatment of the radiative properties of stratified media based on the electromagnetic wave theory, considering partial coherence, and extended the discussion to periodic structures, i.e., photonic crystals. A discussion of the coupled-wave analysis was also presented for periodic gratings. Moreover, a survey was given of some recent research on the bidirectional reflectance of anisotropic surfaces. An important area that was not covered is light scattering by small particles and colloids, for which there have been tremendous interest and extensive studies. The heat transfer community is very familiar with light scattering and radiative transfer in scattering and absorbing media. Recently, more and more studies on light scattering have employed rigorous treatments of the electromagnetic wave scattering in random media, considering polarization and surface plasmon resonance. Further discussions on evanescent waves, surface waves including surface plasmon and phonon polaritons, and near-field energy transfer by electromagnetic waves will be given in the next chapter.


REFERENCES

1. R. Siegel and J. R. Howell, Thermal Radiation Heat Transfer, 4th ed., Taylor & Francis, New York, 2002.
2. E. D. Palik (ed.), Handbook of the Optical Constants of Solids, Vols. I, II, and III, Academic Press, San Diego, CA, 1998.
3. O. S. Heavens, Optical Properties of Thin Solid Films, Dover Publications, New York, 1965.
4. Z. Knittl, Optics of Thin Films, Wiley, New York, 1976.
5. M. Q. Brewster, Thermal Radiative Transfer and Properties, Wiley, New York, 1992.
6. Z. M. Zhang, “Reexamination of the transmittance formulae of a lamina,” J. Heat Transfer, 119, 645–647, 1997; Z. M. Zhang, “Optical properties of a slightly absorbing film for oblique incidence,” Appl. Opt., 38, 205–207, 1999.
7. L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, Cambridge, UK, 1995.
8. G. Chen and C. L. Tien, “Partial coherence theory of thin film radiative properties,” J. Heat Transfer, 114, 636–643, 1992.
9. K. Fu, P.-F. Hsu, and Z. M. Zhang, “Unified analytical formulation of thin-film radiative properties including partial coherence,” Appl. Opt., 45, 653–661, 2006.
10. B. J. Lee, V. P. Khuu, and Z. M. Zhang, “Partially coherent spectral radiative properties of dielectric thin films with rough surfaces,” J. Thermophys. Heat Transfer, 19, 360–366, 2005.
11. J. M. Vaughan, The Fabry-Perot Interferometer: History, Theory, Practice and Applications, Adam Hilger, Bristol, PA, 1989.
12. A. R. Kumar, V. A. Boychev, Z. M. Zhang, and D. B. Tanner, “Fabry-Perot resonators built with YBa2Cu3O7−δ films on Si substrates,” J. Heat Transfer, 122, 785–791, 2000.
13. P. Yeh, Optical Waves in Layered Media, Wiley, New York, 1988; P. Yeh, A. Yariv, and C. S. Hong, “Electromagnetic propagation in periodic stratified media. I. General theory,” J. Opt. Soc. Am., 67, 423–438, 1977.
14. C. L. Mitsas and D. I. Siapkas, “Generalized matrix method for analysis of coherent and incoherent reflectance and transmittance of multilayer structures with rough surfaces, interfaces, and finite substrates,” Appl. Opt., 34, 1678–1683, 1995.
15. B. J. Lee and Z. M. Zhang, “Rad-Pro: effective software for modeling radiative properties in rapid thermal processing,” in Proc. 13th IEEE Annu. Int. Conf. Adv. Thermal Processing of Semiconductors (RTP’2005), pp. 275–281, Santa Barbara, CA, October 4–7, 2005.
16. J. D. Joannopoulos, R. D. Meade, and J. N. Winn, Photonic Crystals, Princeton University Press, Princeton, NJ, 1995.
17. K. Sakoda, Optical Properties of Photonic Crystals, Springer-Verlag, Berlin, 2001.
18. J. G. Fleming, S. Y. Lin, I. El-Kady, R. Biswas, and K. M. Ho, “All-metallic three-dimensional photonic crystals with a large infrared bandgap,” Nature, 417, 52–55, 2002; C. H. Seager, M. B. Sinclair, and J. G. Fleming, “Accurate measurements of thermal radiation from a tungsten photonic lattice,” Appl. Phys. Lett., 86, 244105, 2005.
19. H. A. Macleod, Thin Film Optical Filters, 3rd ed., Institute of Physics, Bristol, UK, 2001.
20. D. Maystre (ed.), Selected Papers on Diffraction Gratings, SPIE Milestone Series 83, The International Society for Optical Engineering, Bellingham, WA, 1993.
21. R. Petit (ed.), Electromagnetic Theory of Gratings, Springer, Berlin, 1980.
22. M. G. Moharam, E. B. Grann, D. A. Pommet, and T. K. Gaylord, “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings,” J. Opt. Soc. Am. A, 12, 1068–1076, 1995; M. G. Moharam, D. A. Pommet, E. B. Grann, and T. K. Gaylord, “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: Enhanced transmittance matrix approach,” J. Opt. Soc. Am. A, 12, 1077–1086, 1995.
23. L. F. Li, “Use of Fourier series in the analysis of discontinuous periodic structures,” J. Opt. Soc. Am. A, 13, 1870–1876, 1996.
24. Y. B. Chen, Z. M. Zhang, and P. J. Timans, “Radiative properties of patterned wafers with nanoscale linewidth,” J. Heat Transfer, 129, 79–90, 2007.
25. P. Beckmann and A. Spizzichino, The Scattering of Electromagnetic Waves from Rough Surfaces, Artech House, Norwood, MA, 1987.
26. Q. Z. Zhu and Z. M. Zhang, “Anisotropic slope distribution and bidirectional reflectance of a rough silicon surface,” J. Heat Transfer, 126, 985–993, 2004; Q. Z. Zhu and Z. M. Zhang, “Correlation of angle-resolved light scattering with the microfacet orientation of rough silicon surfaces,” Opt. Eng., 44, 073601, 2005.
27. H. J. Lee, Y. B. Chen, and Z. M. Zhang, “Directional radiative properties of anisotropic rough silicon and gold surfaces,” Int. J. Heat Mass Transfer, 49, 4482–4495, 2006.
28. Y. H. Zhou and Z. M. Zhang, “Radiative properties of semitransparent silicon wafers with rough surfaces,” J. Heat Transfer, 125, 462–470, 2003; H. J. Lee, B. J. Lee, and Z. M. Zhang, “Modeling the radiative properties of semitransparent wafers with rough surfaces and thin-film coatings,” J. Quant. Spectros. Radiat. Transfer, 93, 185–194, 2005.
29. Q. Z. Zhu, H. J. Lee, and Z. M. Zhang, “Validity of hybrid models for the bidirectional reflectance of coated rough surfaces,” J. Thermophys. Heat Transfer, 19, 548–557, 2005.
30. Y. J. Shen, Q. Z. Zhu, and Z. M. Zhang, “A scatterometer for measuring the bidirectional reflectance and transmittance of semiconductor wafers with rough surfaces,” Rev. Sci. Instrum., 74, 4885–4892, 2003.
31. H. J. Lee, A. C. Bryson, and Z. M. Zhang, “Measurement and modeling of the emittance of silicon wafers with anisotropic roughness,” Proc. 16th Symp. Thermophys. Properties, Boulder, CO, July 30–August 4, 2006.

PROBLEMS A greenhouse looks like a small glass house used to grow plants in the winter. Based on the transmittance curve of fused silica (SiO2), shown in Fig. 9.2, explain why glass walls can keep the plants warm in the winter. Discuss the greenhouse effect in the atmosphere. What gases are responsible for the greenhouse effect? 9.2. Calculate the transmittance T, the reflectance R, and the absorptance A of a thick (without considering interference) silicon wafer (0.5 mm thick) at normal incidence. Plot T, R, and A versus wavelength, in the range from 2.5 to 25 m. The refractive index and the extinction coefficient of the doped silicon are given in the following table: 9.1.

Optical Constants of a Doped Silicon Wafer Wavelength l (m) 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 25.0

Refractive index n 3.44 3.43 3.42 3.42 3.42 3.42 3.42 3.42 3.42 3.42

Extinction coefficient k 0 1.0 107 8.4 105 2.1 104 4.0 104 5.0 104 9.0 104 1.0 103 1.1 103 1.3 103

Calculate and plot the transmittance and reflectance for the same silicon wafer described in Problem 9.2 at l 5 m as functions of the polar angle u. Consider the individual polarizations and their average. Compare your results with those by Zhang et al. (Infrared Phys. Technol., 37, 539, 1996). 9.4. Using data from the table in Problem 9.2, calculate and plot the normal transmittance of a 100-m-thick silicon wafer, near 10-m wavelength, considering interference. 9.3.

RADIATIVE PROPERTIES OF NANOMATERIALS

375

(a) Plot the transmittance in terms of wavelength (μm) with data spacings of 0.05 and 0.005 μm, respectively, on one graph.
(b) Plot the transmittance in terms of wavenumber (cm$^{-1}$) with data spacings of 5 and 0.5 cm$^{-1}$, respectively, on one graph.
(c) What is the fringe-averaged transmittance at 10-μm wavelength?
(d) What is the free spectral range in wavenumber and in wavelength? How will $\Delta\nu$ and $\Delta\lambda$ change if the wavelength $\lambda$ is changed to 20 μm?
9.5. For gold, the refractive index at $\lambda = 0.5$ μm is $n = 0.916 + i1.84$, and at $\lambda = 2.0$ μm is $n = 0.85 + i12.6$. Calculate the transmittance of a free-standing gold film at these wavelengths for $d = 10$, 20, 50, and 100 nm, using both Eq. (9.10) and Eq. (9.11). Which equation gives the correct results, and why?
9.6. For the three-layer structure shown in Fig. 9.3, calculate the normal reflectance for $n_1 = 1.45$ (glass), $n_2 = 1$ (air gap), and $n_3 = 2$ (substrate) without any absorption at $\lambda = 1$ μm. Plot the reflectance as a function of the air-gap width $d$. Obtain the analytical formulae of the reflectance maximum and minimum.
9.7. Assume that glass has a refractive index of 1.46 without any absorption in the visible spectrum (0.4 μm $< \lambda <$ 0.7 μm). Design an antireflection coating (for normal incidence) that will minimize the reflectance from a semi-infinite glass. You need to determine the coating thickness and the refractive index (assuming it is independent of wavelength). Plot the normal reflectance of the coated glass surface in the spectral range from 0.4 to 0.7 μm. What material would you recommend for use with the desired property?
9.8. To evaluate the effect of antireflection coating for oblique incidence, assume the antireflection coating has a refractive index of 1.21 and a thickness of 114 nm. What will be the reflectance, at 45° and 60°, for each polarization?
9.9. While the extinction coefficient is often related to absorption or loss, it should be noted that when $\kappa \gg n$, it is the real part of the refractive index that is related to the loss. This is because the dielectric function can be expressed as $\varepsilon = \varepsilon' + i\varepsilon'' = (n^2 - \kappa^2) + i2n\kappa$, where $\varepsilon''$ is related to the dissipation. For a semi-infinite medium, a purely negative dielectric function means perfect reflection. The effect of $n$ on the absorption by a thin film can be studied by considering a thin film of thickness $d$ with a complex refractive index $n_2 = n + i\kappa$. For a wavelength of $\lambda = 0.5$ μm and at normal incidence, let $d = 30$ nm and $\kappa = 3.0$. Plot the transmittance, the reflectance, and the emittance (which is the same as the absorptance), against the refractive index $n$ ranging from 0.01 to 2. Discuss the effect of $n$ on the absorption.
9.10. Use the dielectric function of SiC given in Example 8-7 to calculate the normal emittance for a SiC film at wavelengths from 9 to 15 μm, for different film thicknesses: $d = 1$, 10, 100, and 1000 μm. Assume the multiply reflected waves to be perfectly coherent.
9.11. Calculate the emittance as a function of the emission angle for a doped silicon wafer of 200-μm thickness, at $\lambda = 20$ μm with $n_2 = 3.42 + i0.001$. Consider p and s polarizations separately, and then take an average. Assume the multiply reflected waves to be perfectly coherent.
9.12. This problem concerns the transmission and reflection of infrared radiation of a YBCO (YBa2Cu3O7) film on a thin MgO substrate of 325-μm thickness, at 300 K and normal incidence. For the YBCO film, use the properties for sample A from Kumar et al. (J. Heat Transfer, 121, 844, 1999). For MgO, use the Lorentz model in Problem 8.26.
(a) Plot the radiation penetration depth of the YBCO film, $\delta_f(\lambda)$, and that of MgO, $\delta_s(\lambda)$, for 1 μm $< \lambda <$ 1000 μm.
(b) Neglecting the interference effect in the MgO substrate, calculate and plot the transmittance $T$, the film-side reflectance $R_f$, and the back-side reflectance $R_s$, for 1 μm $< \lambda <$ 1000 μm, with different film thicknesses: 0, 30, 48, 70, and 400 nm. Plot $T$, $R_f$, and $R_s$ in terms of both wavelength (μm) and wavenumber (cm$^{-1}$).
(c) Repeat the previous calculation, considering the interference effects in the MgO substrate, for 200 μm $< \lambda <$ 1000 μm (50 to 10 cm$^{-1}$). Plot in terms of the wavenumber only. What happens with the interference fringes when the film thickness is 48 nm?
9.13. Calculate the normal transmittance of a 10-μm film with a refractive index $n = 2.4$ without any absorption in the spectral range from 1000 to 3000 cm$^{-1}$. One surface of the film is polished, and the other surface has a roughness $\sigma_{\mathrm{rms}}$ of 0.10 μm. How does the $\sigma_{\mathrm{rms}}$ value affect the transmittance? Compare your result with that shown in Fig. 9.10.


9.14. Reproduce Example 9-2 and Fig. 9.10. Suppose the coherence spectral width $\delta\nu = 1.5\,\Delta\nu$, where $\Delta\nu$ is the free spectral range. Determine the fringe-averaged transmittance. Explain why the peaks and the valleys flip after fringe averaging.
9.15. Calculate and plot the transmittance of a Fabry-Perot resonance cavity, assuming the medium to be lossless with $n_2 = 2$, $d_2 = 100$ μm, and $R = 0.9$, for normal incidence in the wavenumber region from 950 to 1050 cm$^{-1}$. What are the free spectral range, the FWHM of the peak, and the Q-factor of the resonator? Does the theoretically predicted FWHM match with the plot?
9.16. Group project: A reflectance Fabry-Perot cavity can be constructed by coating a SiO2 film onto a silver substrate first and then a thin silver film onto the SiO2 film. Derive a formula for the reflectance. Based on Kirchhoff’s law, one can calculate the emissivity of the structure. Show that the emissivity exhibits sharp peaks close to unity at specific wavelengths for normal incidence. When the wavelength is fixed, calculate the emissivity versus the polar angle for each polarization. Plot and show that there exist angular lobes in the emissivity of such structures. Hint: Choose the thicknesses of the silver film (on the order of 100 nm) and the SiO2 film (on the order of 3000 nm), and the wavelength (around 1 μm). Use the optical constants from Palik.2
9.17. Group project: Develop a Matlab code for the multilayer radiative properties based on the matrix formulation described in the text for both TE and TM waves. Compare your results with those calculated by using Rad-Pro, downloadable from www.me.gatech.edu/~zzhang.
9.18. Group project: Evaluate and plot the band structures of a Bragg reflector made of quarter-wave high- and low-index materials GaAs, $n = 3.49$, and AlAs, $n = 2.95$, around the wavelength of 1064 nm. Optional: Plot the normal reflectance near 1064-nm wavelength with 7, 17, and 27 periods, assuming that the substrate is GaAs.
9.19. Derive Eq. (9.60) and Eq. (9.61).
9.20. Based on Eq. (9.64), show that when the evanescent wave exists, it will decay toward negative $z$. Change the subscript from I to III, and show that when the evanescent wave exists, it will decay toward positive $z$.
9.21. Derive Eq. (9.71) and Eq. (9.72).
9.22. Use different effective medium formulations to compute the effective dielectric function for silicon with a filling ratio $\phi = 1/16$ in air at $\lambda = 300$ nm ($n = 5.0$ and $\kappa = 4.2$), $\lambda = 400$ nm ($n = 5.6$ and $\kappa = 0.39$), $\lambda = 500$ nm ($n = 4.3$ and $\kappa = 0.073$), and $\lambda = 800$ nm ($n = 3.7$ and $\kappa = 0.0066$).
9.23. Consider a grating region consisting of Si, with a filling ratio of 1/6, on a semi-infinite Si substrate. The height of the grating is 50 nm. Calculate the reflectance for normal incidence, using different effective medium formulations at the corresponding wavelengths given in Problem 9.22. Compare your results with those in Fig. 9.19.
9.24. Plot the shadowing function for a Gaussian distribution as a function of the polar angle $\theta$ for the rms slopes $w = 0.05$, 0.1, 0.2, and 0.3.
9.25. Calculate the BRDFs at $\lambda = 0.5$ and 2 μm based on the analytical model for a gold surface (opaque) with Gaussian roughness statistics. The SDF is given by

$$p(\zeta_x, \zeta_y) = \frac{1}{2\pi w^2} \exp\left(-\frac{\zeta_x^2 + \zeta_y^2}{2w^2}\right)$$

Use the optical constants from Problem 9.5 and the rms slope $w = 0.1$ and 0.3.
9.26. Comment on the limitations of different analytical models for the BRDF, such as the Rayleigh-Rice perturbation theory, the Kirchhoff approximation, and the geometric optics approximation.

CHAPTER 10

NEAR-FIELD ENERGY TRANSFER

Near-field optics has played a significant role in nanoscience and nanobiotechnology in the past 20 years and continues to be an active research area, especially when dealing with field localization and resonances in micro/nanostructures, with applications in biochemical sensing and nanolithography. The preceding two chapters have laid the foundation of electromagnetic waves in bulk materials and nanostructures. The present chapter offers a more detailed treatment of the energy transfer by electromagnetic waves in the near field, as well as the coupling between near-field phenomena and far-field characteristics. The applications include nanomanufacturing, energy conversion systems, and nanoelectronics thermal management.

Ernst Abbe in 1873 and Lord Rayleigh in 1879 studied the required angular separation between two objects for their images to be resolved. The resolution of a conventional microscope is diffraction limited such that the smallest resolvable distance is approximately $0.5\lambda/n$, where $\lambda$ is the wavelength in vacuum and $n$ is the refractive index of the medium. Even with an immersion oil ($n \approx 1.5$), the imaging sharpness is limited to the order of the wavelength. The concept of near-field imaging was first described by Synge (Phil. Mag., 6, 356, 1928). This work elaborated the concept of using a subwavelength aperture as small as 10 nm in diameter to introduce light to a specimen (e.g., a stained biological section), placed within a distance of 10 nm, which could move in its plane with a step size of less than 10 nm. By measuring the transmitted light with a photoelectric cell and a microscope, an ultramicroscopic image could be constructed. In a subsequent paper (Phil. Mag., 13, 297, 1932), Synge described the idea of using piezoelectricity in microscopy. Synge’s works, however, went largely unnoticed, and the idea of near-field imaging was rediscovered many years later.
Ash and Nicholls published a paper (Nature, 237, 510, 1972) entitled “Super-resolution aperture scanning microscope.” This work experimentally demonstrated near-field imaging with a resolution of $\lambda/60$ using 10-GHz microwave radiation ($\lambda = 3$ cm). In the 1980s, two groups successfully developed near-field microscopes in the visible region.1,2 The IBM group in Zurich formed the aperture through a quartz tip coated with a metallic film on its sides,1 whereas the Cornell group used silicon microfabrication to form the aperture.2 The fabrication process was later improved by using metal-coated tapered optical fibers. In the early 1990s, Betzig at Bell Labs and collaborators demonstrated single molecule detection and data storage capability of 45 gigabits per square inch.3 Nowadays, the near-field scanning optical microscope (NSOM), also known as the scanning near-field optical microscope (SNOM), has become a powerful tool in the study of fundamental space- and time-dependent processes, thermal metrology, and optical manufacturing with a spatial resolution of less than 50 nm. NSOM is usually combined with the atomic force microscope (AFM) for highly controllable movement and position sensing. An alternative approach is to use a metallic AFM tip to couple the far-field radiation with the near-field electromagnetic waves in a subwavelength region underneath the tip. This is the so-called apertureless NSOM, which does not require an optical fiber or an aperture. Apertureless tips allow high-intensity laser energy to be focused to nanoscale dimensions for laser-assisted nanothermal manufacturing.4,5

Figure 10.1 illustrates three typical NSOM designs. The first is an aperture-based setup, where a very small opening is formed on an opaque plate and collimated light is incident from above. The second is based on a tapered optical fiber whose tip serves as an aperture. The third uses an apertureless sharp metallic tip, which reflects (scatters) the incident laser light.

FIGURE 10.1 Schematic illustration of different NSOM setups. (a) Aperture on an opaque plate. (b) Aperture at the end of a coated optical fiber. (c) Apertureless metallic tip. The opening or the tip is much smaller than the wavelength $\lambda$. The electric field is highly collimated in the near field within a distance of about $\lambda$ and diverges as the distance increases.

All three designs have one thing in common: the light is confined to a narrow region whose width may be much less than a wavelength. Furthermore, the electromagnetic field within a distance of one wavelength is very intense and highly collimated. In the near-field region, evanescent waves dominate. Because the amplitude of an evanescent wave decays exponentially away from the aperture or tip, the far field, or the radiation field, diverges and becomes very weak. Understanding the nature of evanescent waves and the localized fields is essential for the NSOM and other near-field optical devices. Evanescent waves are also essential in energy transfer between adjacent objects, through photon tunneling, and in surface plasmon polaritons or surface phonon polaritons. Polaritons are elementary excitations in solids due to charge oscillations near the interface and can interact strongly with electromagnetic waves. In this chapter, we will first use total internal reflection to introduce evanescent waves, and then discuss polaritons or electromagnetic surface waves. The application to construct coherent thermal emission sources and radiation heat transfer at nanometer distances will be presented afterward.

10.1 TOTAL INTERNAL REFLECTION, GUIDED WAVES, AND PHOTON TUNNELING

Total internal reflection occurs when light travels from an optically denser material to another material at incidence angles greater than the critical angle determined by Snell’s law. As discussed in Chap. 8, the amplitude of the reflection coefficient becomes unity at incidence angles greater than the critical angle. Although no energy is transferred from medium 1 to medium 2, there exists an electromagnetic field in the second medium near the surface. This electromagnetic field can store as well as exchange energy with medium 1 at any instant of time. The time-averaged energy flux must be zero across the interface. Total internal reflection has important applications in optical fibers and waveguides.

When medium 2 is not infinitely extended but is a very thin layer sandwiched between the first medium and the third medium (which may be made of the same material as that of medium 1), photons can tunnel through the second medium into the third, even though the angle of incidence is greater than the critical angle. This phenomenon is called photon tunneling, radiation tunneling, or frustrated total internal reflection, and has been studied for over 300 years since Newton’s time. Detailed descriptions of the original experiments and analyses by Isaac Newton can be found in his classic book, Opticks (reprinted by Dover Publications in 1952). The enhanced energy transfer by photon tunneling may have applications in thermophotovoltaic energy conversion devices as well as nanothermal manufacturing using heated AFM cantilever tips.

10.1.1 The Goos-Hänchen Shift

Evanescent waves can be illustrated by using the total internal reflection arrangement. Consider a plane wave of angular frequency $\omega$ incident from a semi-infinite medium 1 to medium 2, as shown in Fig. 10.2a. The wavevectors of the incident, reflected, and transmitted waves are $\mathbf{k}_1^+ = k_x\hat{\mathbf{x}} + k_{1z}\hat{\mathbf{z}}$, $\mathbf{k}_1^- = k_x\hat{\mathbf{x}} - k_{1z}\hat{\mathbf{z}}$, and $\mathbf{k}_2 = k_x\hat{\mathbf{x}} + k_{2z}\hat{\mathbf{z}}$, since the parallel wavevector component $k_x$ must be the same as required by the phase-matching boundary condition.

FIGURE 10.2 Illustration of total internal reflection. (a) Schematic of the incident, reflected, and transmitted waves at the interface between two semi-infinite media. (b) The magnetic field distribution for a TM wave when total internal reflection occurs.

The magnitudes of the wavevectors are

$$k_1^2 = k_x^2 + k_{1z}^2 = \varepsilon_1\mu_1\omega^2/c^2 \tag{10.1a}$$

and

$$k_2^2 = k_x^2 + k_{2z}^2 = \varepsilon_2\mu_2\omega^2/c^2 \tag{10.1b}$$

where $\varepsilon$ and $\mu$ are the relative (ratio to those of vacuum) permittivity and permeability, respectively, and $c$ is the speed of light in vacuum (throughout this chapter). Let us assume that the incident wave is p polarized or a TM wave, so that the only nonzero component of the magnetic field is in the y direction. The magnetic field of the incident wave may be expressed as $\mathbf{H}_i = (0, H_y, 0)$, where $H_y(x, y, z, t) = H_i e^{ik_{1z}z + ik_x x - i\omega t}$. For simplicity, let us omit $\exp(-i\omega t)$ from now on. Recall that the Fresnel coefficients for a TM wave are defined as the ratios of the reflected or transmitted magnetic field to the incident magnetic field. For example, the Fresnel reflection coefficient is

$$r_p = \frac{H_r}{H_i} = \frac{k_{1z}/\varepsilon_1 - k_{2z}/\varepsilon_2}{k_{1z}/\varepsilon_1 + k_{2z}/\varepsilon_2} \tag{10.2}$$

The field in medium 1 is composed of the incident and reflected fields, and that in medium 2 is the transmitted field. Therefore,

$$\frac{H_y}{H_i} = \begin{cases} \left(e^{ik_{1z}z} + r_p e^{-ik_{1z}z}\right)e^{ik_x x} & \text{for } z < 0 \\ (1 + r_p)\,e^{ik_{2z}z}e^{ik_x x} & \text{for } z > 0 \end{cases} \tag{10.3}$$

The electric fields can be obtained by applying the Maxwell equations. Similar to Sec. 8.3.1, we can write the electric field components in both media as follows:

$$\frac{E_x}{H_i} = \begin{cases} \dfrac{k_{1z}}{\omega\varepsilon_1\varepsilon_0}\left(e^{ik_{1z}z} - r_p e^{-ik_{1z}z}\right)e^{ik_x x} & \text{for } z < 0 \\ \dfrac{k_{2z}}{\omega\varepsilon_2\varepsilon_0}(1 + r_p)\,e^{ik_{2z}z}e^{ik_x x} & \text{for } z > 0 \end{cases} \tag{10.4}$$

and

$$\frac{E_z}{H_i} = \begin{cases} -\dfrac{k_x}{\omega\varepsilon_1\varepsilon_0}\left(e^{ik_{1z}z} + r_p e^{-ik_{1z}z}\right)e^{ik_x x} & \text{for } z < 0 \\ -\dfrac{k_x}{\omega\varepsilon_2\varepsilon_0}(1 + r_p)\,e^{ik_{2z}z}e^{ik_x x} & \text{for } z > 0 \end{cases} \tag{10.5}$$

Assume that the $\varepsilon$’s and $\mu$’s are real and, furthermore, $\varepsilon_1\mu_1 > \varepsilon_2\mu_2 > 0$. From Eq. (10.1b), we have $k_{2z}^2 = \varepsilon_2\mu_2\omega^2/c^2 - k_x^2$. When $\sqrt{\varepsilon_2\mu_2} < k_x c/\omega < \sqrt{\varepsilon_1\mu_1}$, the incidence angle $\theta_1$ is defined but the refraction angle is not, because $k_{2z}$ becomes imaginary. One can write $k_{2z} = i\eta_2$, where $\eta_2 = \sqrt{k_x^2 - \varepsilon_2\mu_2\omega^2/c^2}$ is a real positive number. In this case, $|r_p| = 1$ and

$$r_p = e^{i\delta} = e^{-i2\alpha} \tag{10.6}$$

where $\tan\alpha = (\eta_2/\varepsilon_2)/(k_{1z}/\varepsilon_1)$. Following Haus,6 the magnetic field at $x = 0$ in medium 1 can be written as

$$H_y = 2H_i e^{-i\alpha}\cos(k_{1z}z + \alpha), \quad z \le 0 \tag{10.7a}$$

Similarly, in medium 2, $H_y$ becomes

$$H_y = 2H_i e^{-i\alpha}\cos(\alpha)\,e^{-\eta_2 z}, \quad z \ge 0 \tag{10.7b}$$

The magnetic field at $x = 0$ is plotted in Fig. 10.2b with respect to $k_{1z}z$, at the instant of time when the phase of $H_i e^{-i\alpha - i\omega t}$ becomes zero. From this figure, one can see that the field decays exponentially in medium 2. As a result of total internal reflection, there is a phase shift in medium 1 so that the maximum field is shifted from the interface to $k_{1z}z = -\alpha$. The phase angle of the reflection coefficient, $\delta = -2\alpha$, is called the Goos-Hänchen phase shift, which depends on the incidence angle $\theta_1$ or $k_x$. The difference in $\delta$ for TE and TM waves in a dielectric prism was used to construct a polarizer called Fresnel’s rhomb, which can change a linearly polarized wave to a circularly polarized wave, or vice versa.7
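The statement that the reflection coefficient has unit magnitude and a phase of $-2\alpha$ beyond the critical angle can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes nonmagnetic, lossless media ($\varepsilon = n^2$), expresses wavevectors in units of $\omega/c$, and uses a hypothetical glass-to-air case; the function name `tir_reflection_tm` and the symbol `eta2` for the decay constant in medium 2 are illustrative choices.

```python
import cmath
import math


def tir_reflection_tm(n1, n2, theta1_deg):
    """Return (r_p, alpha) for a TM wave under total internal reflection.

    Assumes nonmagnetic, lossless media (eps = n**2, mu = 1); wavevectors
    are expressed in units of omega/c.
    """
    eps1, eps2 = n1 ** 2, n2 ** 2
    kx = n1 * math.sin(math.radians(theta1_deg))
    if kx <= n2:
        raise ValueError("incidence angle is not beyond the critical angle")
    k1z = math.sqrt(eps1 - kx ** 2)
    eta2 = math.sqrt(kx ** 2 - eps2)      # decay constant, k2z = i * eta2
    k2z = 1j * eta2
    # Fresnel reflection coefficient for the magnetic field of a TM wave
    r_p = (k1z / eps1 - k2z / eps2) / (k1z / eps1 + k2z / eps2)
    alpha = math.atan((eta2 / eps2) / (k1z / eps1))
    return r_p, alpha


# Assumed example: glass (n1 = 1.5) to air (n2 = 1.0) at 60 degrees
r_p, alpha = tir_reflection_tm(1.5, 1.0, 60.0)
print("abs(r_p)     =", abs(r_p))
print("phase of r_p =", cmath.phase(r_p))
print("-2 * alpha   =", -2.0 * alpha)
```

The printed phase of `r_p` and the value of `-2 * alpha` should agree to rounding error, which is the content of Eq. (10.6).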

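Example 10-1. Calculate the time-averaged Poynting vector near the interface in the case of total internal reflection.

Solution. Based on Example 8-1, it can be seen that the Poynting vector $\mathbf{S} = \mathrm{Re}(\mathbf{E}) \times \mathrm{Re}(\mathbf{H})$ is in general a function of time. The time-dependent terms that oscillate with $2\omega$, however, become zero after integration. The time-averaged Poynting vector is $\langle\mathbf{S}\rangle = \frac{1}{2}\mathrm{Re}(\mathbf{E} \times \mathbf{H}^*)$. For $z > 0$, $\langle S_z\rangle = \frac{1}{2}\mathrm{Re}(E_x H_y^*) = 0$ because $k_{2z}$ is purely imaginary. It can also be shown that $\langle S_z\rangle = 0$ for $z < 0$ (see Problem 10.2). Furthermore,

$$\langle S_x\rangle = -\frac{1}{2}\mathrm{Re}(E_z H_y^*) = \begin{cases} \dfrac{k_x}{\omega\varepsilon_1\varepsilon_0}|H_i|^2\left[1 + \cos(2k_{1z}z + 2\alpha)\right], & z < 0 \\ \dfrac{k_x}{\omega\varepsilon_2\varepsilon_0}|H_i|^2\left[1 + \cos(2\alpha)\right]e^{-2\eta_2 z}, & z > 0 \end{cases} \tag{10.8}$$

where $\eta_2$ is the decay constant of the evanescent wave in medium 2 ($k_{2z} = i\eta_2$). Note that $\langle S_x\rangle$ does not have to be continuous at the interface. Depending on whether $\varepsilon$ is positive or negative, the sign of $\langle S_x\rangle$ may be the same as or opposite to $k_x$. It should also be noted that $\langle S_x\rangle$ is a sinusoidal function of $z$ in medium 1 and decays exponentially in medium 2 as $z$ approaches infinity.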
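The two results of this example, zero normal flux and an exponentially decaying parallel flux in medium 2, can be verified by building the fields from the Fresnel coefficient and evaluating $\frac{1}{2}\mathrm{Re}(\mathbf{E} \times \mathbf{H}^*)$ directly. The sketch below is illustrative only and rests on stated assumptions: normalized units with $H_i = 1$ and $\omega = c = \varepsilon_0 = 1$, nonmagnetic lossless media, and hypothetical function names.

```python
import cmath
import math


def _tm_parameters(n1, n2, theta1_deg):
    """Helper: (eps1, eps2, kx, k1z, eta2) in units where omega/c = 1."""
    eps1, eps2 = n1 ** 2, n2 ** 2
    kx = n1 * math.sin(math.radians(theta1_deg))
    k1z = math.sqrt(eps1 - kx ** 2)
    eta2 = math.sqrt(kx ** 2 - eps2)      # requires total internal reflection
    return eps1, eps2, kx, k1z, eta2


def poynting_medium2(n1, n2, theta1_deg, z):
    """Time-averaged (Sx, Sz) at x = 0 and depth z >= 0 in medium 2 (TM wave)."""
    eps1, eps2, kx, k1z, eta2 = _tm_parameters(n1, n2, theta1_deg)
    k2z = 1j * eta2
    r_p = (k1z / eps1 - k2z / eps2) / (k1z / eps1 + k2z / eps2)
    hy = (1.0 + r_p) * cmath.exp(1j * k2z * z)   # transmitted magnetic field
    ex = k2z / eps2 * hy                          # from curl H = -i*omega*eps*E
    ez = -kx / eps2 * hy
    sz = 0.5 * (ex * hy.conjugate()).real
    sx = -0.5 * (ez * hy.conjugate()).real
    return sx, sz


def sx_closed_form(n1, n2, theta1_deg, z):
    """Parallel flux in medium 2 from the closed-form expression, Eq. (10.8)."""
    eps1, eps2, kx, k1z, eta2 = _tm_parameters(n1, n2, theta1_deg)
    alpha = math.atan((eta2 / eps2) / (k1z / eps1))
    return kx / eps2 * (1.0 + math.cos(2.0 * alpha)) * math.exp(-2.0 * eta2 * z)


sx, sz = poynting_medium2(1.5, 1.0, 60.0, 0.3)   # assumed glass-to-air case
print("Sz (field-based)  =", sz)
print("Sx (field-based)  =", sx)
print("Sx (closed form)  =", sx_closed_form(1.5, 1.0, 60.0, 0.3))
```

The field-based and closed-form values of the parallel flux should coincide, while the normal component vanishes to rounding error.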

Newton conjectured that, for the total internal reflection of light by the boundary, the beam of light would penetrate some distance into the optically rarer medium and then reenter the optically denser medium. In addition, he suspected that the path of the beam would be a parabola with its vertex in the rarer medium and, consequently, the actual reflected beam would be shifted laterally with respect to the geometric optics prediction. From the Poynting vector formulation given in Eq. (10.8), the energy must penetrate into the second medium to maintain the energy flow parallel to the interface and reenter the first medium so that no net energy is transferred across the interface. The actual beams have a finite extension so that the reflected beam in the far field can be separated from the incident beam since the Poynting vector is parallel to the wavevector. The effect of the parallel energy flow indeed causes the reflected beam to shift forward from that expected by the geometric optics analysis. Goos and Hänchen were the first to observe the lateral beam shift through a cleverly devised experiment [Ann. Physik, 6(1), 333, 1947; 6(5), 251, 1949]. A schematic of this experiment is shown in Fig. 10.3, in which a glass plate was used so that the incident light was multiply reflected by the top and bottom surfaces. In the middle of one or both of the surfaces, a silver strip was deposited. This way, the beam reflected by the silver film (solid line) would essentially follow geometric optics and that by total internal reflection would experience a lateral shift. Although the lateral shift is on the order of the wavelength, a large number of reflections (over 100 times) allowed the shift to be observed by a photographic plate. Lotsch published a series of papers on the comprehensive study of the Goos-Hänchen effect.8 Puri and Birman provided an elegant review of earlier works, including several methods for analyzing the Goos-Hänchen effect.9

FIGURE 10.3 Illustration of the Goos-Hänchen experiment.

A quantitative study of the Goos-Hänchen effect is presented next. One way to model the lateral shift is to use a beam of finite width rather than an unbounded plane wave. Another method that is mathematically simpler considers the phase change of an incoming wave packet, which is composed of two plane waves with a slightly different $k_x$. Upon total internal reflection, the phase shift $\delta = -2\alpha$ for a given polarization is a function of $k_x$. The difference in the phase shift will cause the reflected beam to exhibit a lateral shift along the interface (x direction) given as

$$D = -\frac{d\delta}{dk_x} = \frac{\varepsilon_1}{\varepsilon_2}\,\frac{2k_x}{\eta_2 k_{1z}}\,\frac{k_{1z}^2 + \eta_2^2}{k_{1z}^2 + (\eta_2\varepsilon_1/\varepsilon_2)^2} \tag{10.9}$$

where we have used $\alpha = \tan^{-1}[\eta_2\varepsilon_1/(k_{1z}\varepsilon_2)]$, with $k_{2z} = i\eta_2$. In formulating Eq. (10.9), $k_x$ is always taken as positive. Equation (10.8) clearly suggests that $\langle S_x\rangle$ and $k_x$ have the same sign when the permittivity is positive and different signs when the permittivity is negative.10 When $\varepsilon_1$ and $\varepsilon_2$ have different signs, the lateral shift $D$ will be negative, which implies that the lateral shift is opposite to $\langle S_x\rangle$ of the incident beam. For a TE wave, one can simply replace the $\varepsilon$’s by $\mu$’s in Eq. (10.9). For two dielectrics, we have $\mu_1 = \mu_2 = 1$, $\varepsilon_1 = n_1^2$, and $\varepsilon_2 = n_2^2$, where $n_1$ and $n_2$ are the refractive indices of medium 1 and 2, respectively. Consequently, Eq. (10.9) reduces to the following:

$$D_s = \frac{2\tan\theta_1}{\eta_2} \quad \text{for a TE wave} \tag{10.10a}$$

and

$$D_p = \frac{2\tan\theta_1}{\eta_2\left(n_1^2\sin^2\theta_1/n_2^2 - \cos^2\theta_1\right)} \quad \text{for a TM wave} \tag{10.10b}$$
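The reduction from the general expression for the lateral shift to the two-dielectric forms can be confirmed numerically. The sketch below is only an illustration under stated assumptions: nonmagnetic media, a vacuum wavelength `wavelength` that sets the length unit, and hypothetical function names. The general formula is written for a TM wave in terms of the permittivities, with `eta2` denoting the decay constant in medium 2.

```python
import math


def lateral_shift_tm(eps1, eps2, kx, k0):
    """Lateral shift D for a TM wave from the general expression, Eq. (10.9).

    Nonmagnetic media are assumed; kx and k0 = omega/c share the same unit,
    and D is returned in the corresponding length unit.
    """
    k1z = math.sqrt(eps1 * k0 ** 2 - kx ** 2)
    eta2 = math.sqrt(kx ** 2 - eps2 * k0 ** 2)
    ratio = eps1 / eps2
    return (ratio * 2.0 * kx / (eta2 * k1z)
            * (k1z ** 2 + eta2 ** 2) / (k1z ** 2 + (eta2 * ratio) ** 2))


def shift_te(n1, n2, theta1_deg, wavelength):
    """D_s for two dielectrics, Eq. (10.10a)."""
    theta1 = math.radians(theta1_deg)
    k0 = 2.0 * math.pi / wavelength
    eta2 = k0 * math.sqrt((n1 * math.sin(theta1)) ** 2 - n2 ** 2)
    return 2.0 * math.tan(theta1) / eta2


def shift_tm(n1, n2, theta1_deg, wavelength):
    """D_p for two dielectrics, Eq. (10.10b)."""
    theta1 = math.radians(theta1_deg)
    factor = (n1 / n2) ** 2 * math.sin(theta1) ** 2 - math.cos(theta1) ** 2
    return shift_te(n1, n2, theta1_deg, wavelength) / factor


# Assumed example: glass to air at 60 degrees, wavelength of 1 (arbitrary unit)
n1, n2, theta1_deg, wavelength = 1.5, 1.0, 60.0, 1.0
k0 = 2.0 * math.pi / wavelength
kx = k0 * n1 * math.sin(math.radians(theta1_deg))
print("D_s / wavelength =", shift_te(n1, n2, theta1_deg, wavelength))
print("D_p / wavelength =", shift_tm(n1, n2, theta1_deg, wavelength))
print("general formula  =", lateral_shift_tm(n1 ** 2, n2 ** 2, kx, k0))
# A negative permittivity in medium 2 gives a negative (backward) shift
print("eps2 = -10       =", lateral_shift_tm(n1 ** 2, -10.0, kx, k0))
```

For this assumed case both shifts are a fraction of the wavelength, consistent with the remark that many reflections were needed to observe the effect.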

At grazing incidence, $k_{1z} \to 0$; however, the shift in the direction parallel to the beam is $D\cos\theta_1 \approx (2/\eta_2)(\varepsilon_2/\varepsilon_1)\sin\theta_1$, which approaches a finite value and does not diverge. At the critical angle, $\theta_1 = \theta_c = \sin^{-1}(n_2/n_1)$, $\eta_2 = \delta = 0$, and $D$ approaches infinity. This difficulty can be removed by using the Gaussian beam incidence.11 Quantum mechanics has also been applied to predict the lateral beam shift.8 The Goos-Hänchen effect also has its analogy in acoustics and is of contemporary interest in dealing with negative index materials, waveguides, and photon tunneling.10,12,13

10.1.2 Waveguides and Optical Fibers

Optical fibers and waveguides are essential for optical communication and optoelectronics. There are numerous other applications such as noncontact radiation thermometry, near-field microscopy, and decorative lighting. According to a report in 2000, the total length of optical fibers that had been installed worldwide exceeded $3.0 \times 10^{11}$ m, which equals the distance of a round trip from the earth to the sun. Optical fibers usually operate based on the principle of total internal reflection, as shown in Fig. 10.4. The fiber core is usually surrounded by a cladding material with a lower refractive index.


FIGURE 10.4 Schematic of a planar dielectric waveguide.

The numerical aperture NA is defined according to the half angle $\theta_h$ of the acceptance cone, within which total internal reflection occurs. It can be seen from Fig. 10.4 that

$$\mathrm{NA} = \sin\theta_h = n_1\cos\theta_c = \sqrt{n_1^2 - n_2^2} \tag{10.11}$$
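Equation (10.11) is simple to evaluate. The short sketch below assumes that light enters the fiber from air (refractive index of 1 outside the core) and uses an assumed core/cladding pair of $n_1 = 1.53$ and $n_2 = 1.46$; the function name is an illustrative choice.

```python
import math


def numerical_aperture(n1, n2):
    """NA of a step-index guide from Eq. (10.11); light is assumed to enter from air."""
    return math.sqrt(n1 ** 2 - n2 ** 2)


n1, n2 = 1.53, 1.46                                  # assumed core and cladding indices
theta_c = math.degrees(math.asin(n2 / n1))           # critical angle at the core/cladding interface
na = numerical_aperture(n1, n2)
theta_h = math.degrees(math.asin(na))                # half angle of the acceptance cone

print(f"critical angle   = {theta_c:.1f} deg")
print(f"acceptance angle = {theta_h:.1f} deg")
print(f"NA               = {na:.2f}")
```

The identity $\sin\theta_h = n_1\cos\theta_c$ can be checked with the same variables by comparing `na` with `n1 * math.cos(math.radians(theta_c))`.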

For example, if $n_1 = 1.53$ and $n_2 = 1.46$, the critical angle $\theta_c = 72.6°$, the maximum cone angle $\theta_h = 27°$, and NA $= 0.46$. There are different types of waveguides, such as graded-index waveguides and metallic waveguides, in addition to the simple dielectric type. The cross section may be circular, annular, rectangular, or elliptical. In some cases, the diameter of the fiber is much greater than the wavelength and the electromagnetic waves inside the fiber are incoherent. These devices are sometimes called lightpipes, which are used for relatively short distances. Optical fibers in communication technology use very thin wires and transmit light with well-defined modes. In the following, the configuration of a 1-D dielectric slab between two media will be discussed to illustrate the basics of an optical waveguide. More detailed treatments can be found in the texts of Haus6 and Kong.7 The present author was fortunate to learn optoelectronics and the electromagnetic wave theory through graduate courses taught by these professors.

Consider the planar structure shown in Fig. 10.4 that is infinitely extended in the y direction. When the variation of $d$ along the x direction is negligibly small compared to the wavelength, the electromagnetic waves inside the waveguide are coherent. A standing wave pattern must be formed in the z direction. This requires the phase change in the z direction, for the round trip including two reflections at the boundary, to be a multiple of $2\pi$, i.e.,

$$2k_{1z}d + 2\delta = 2m\pi, \quad m = 0, 1, 2, \ldots \tag{10.12}$$

where $k_{1z} = (\omega/c)\,n_1\cos\theta_1$, and the phase shift upon total internal reflection is

$$\delta = -2\alpha = -2\tan^{-1}\left(\gamma\,\frac{\sqrt{\sin^2\theta_1 - \sin^2\theta_c}}{\cos\theta_1}\right) \tag{10.13}$$

where $\gamma = 1$ for TE waves and $\gamma = n_1^2/n_2^2$ for TM waves. The solutions of Eq. (10.12) give discrete values of $\theta_1$ or $k_x = (\omega/c)\,n_1\sin\theta_1$, at which waves can propagate through the fiber for a prescribed frequency. These are called guided modes of the optical fiber, and Eq. (10.12) may be regarded as the mode equation. The orders of mode are identified as TE$_0$, TE$_1$, ..., TE$_m$ or TM$_0$, TM$_1$, ..., TM$_m$ for a 1-D waveguide. For a 2-D waveguide, the subscripts consist of two indices “ml” for each mode. As $\theta_1$ decreases from $\pi/2$ to $\theta_c$, $k_{1z}$ increases and higher-order modes can be excited. One might wonder why $\theta_1 = \pi/2$ or $k_x = k_1$ is not a guided mode. In this case, energy would go through the core, cladding, and air in a straight line. Any bending in the waveguide would result in some loss of energy transfer. On the other hand, the guided modes are much less affected by the bending. This is why an optical fiber can transfer signals over a very long distance while being flexible. To illustrate the solution in terms of $k_{1z}d$, let us rearrange Eq. (10.12) as follows:

$$\tan\left(\frac{k_{1z}d}{2} - \frac{m\pi}{2}\right) = \tan\alpha = \gamma\,\frac{\eta_2}{k_{1z}} = \gamma\sqrt{\frac{(k_1 d)^2 - (k_2 d)^2}{(k_{1z}d)^2} - 1} \tag{10.14}$$

where $\eta_2 = \sqrt{k_x^2 - k_2^2}$ is the decay constant in the cladding. The left and right sides of Eq. (10.14) can be plotted in the same graph against $k_{1z}d$, as shown in Fig. 10.5, for two values of $\omega d$, assuming $\omega_2 d_2 > \omega_1 d_1$. The dash-dotted curves are for TE waves, and the dotted curves are for TM waves. The intersections within the circles identify the guided modes. It is noted that fewer modes are permitted with a smaller $\omega d$ or $d/\lambda$. In the graph with $\omega_1 d_1$, the possible modes are TE$_0$, TE$_1$, TM$_0$, and TM$_1$ only. A fiber that supports only a single mode for a given frequency is called a single-mode fiber; otherwise, it is called a multimode fiber.

FIGURE 10.5 Solutions of the mode equation, when $\omega_2 d_2 > \omega_1 d_1$. The circles indicate the intersections between the curves described by the left and right sides of Eq. (10.14).

Example 10-2. Determine the range of $d/\lambda$ so that only the TE$_0$ and TM$_0$ waves are guided in the planar waveguide with $n_1 = 1.55$ and $n_2 = 1.42$. Moreover, if $d/\lambda = 1000$, how many TE and TM modes may be guided?

Solution. Because $d/\lambda$ must be small enough so that the right-hand side of Eq. (10.14) becomes zero at $k_{1z}d < \pi$, we have $(k_1 d)^2 - (k_2 d)^2 < \pi^2$, or $4\pi^2(n_1^2 - n_2^2)(d/\lambda)^2 < \pi^2$. Finally, we find $d/\lambda < 0.5\,(n_1^2 - n_2^2)^{-1/2} \approx 0.80$. Moreover, from Fig. 10.5, we can estimate the highest-order mode $M$ using $k_{1z}d \approx M\pi$ and $\cos\theta_1 \approx \cos\theta_c$ when $d \gg \lambda$. Since $k_{1z} = (2\pi/\lambda)\,n_1\cos\theta_1$, we have $2\pi n_1(d/\lambda)\cos\theta_c \approx M\pi$, or $M \approx 2n_1(d/\lambda)\cos\theta_c = 2(d/\lambda)(n_1^2 - n_2^2)^{1/2} \approx 1242.7$. There will be 1243 TE modes and 1243 TM modes including the zeroth-order modes.
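The cutoff argument used in this example can be turned into a few lines of code. The sketch below is illustrative and rests on stated assumptions: $\lambda$ is the vacuum wavelength, so that $k_1 = 2\pi n_1/\lambda$ and the factor $n_1$ is retained in $k_{1z} = (\omega/c)\,n_1\cos\theta_1$; a mode of order $m$ is counted as guided when $m\pi$ does not exceed the value of $k_{1z}d$ at which the right-hand side of Eq. (10.14) vanishes. The function names are hypothetical.

```python
import math


def single_mode_limit(n1, n2):
    """Largest d/lambda (vacuum wavelength) for which only TE0 and TM0 are guided."""
    return 0.5 / math.sqrt(n1 ** 2 - n2 ** 2)


def count_guided_modes(n1, n2, d_over_lambda):
    """Number of guided modes per polarization, m = 0, 1, ..., floor(V/pi).

    V = sqrt((k1*d)**2 - (k2*d)**2) is the value of k1z*d at which the
    right-hand side of Eq. (10.14) reaches zero.
    """
    v = 2.0 * math.pi * d_over_lambda * math.sqrt(n1 ** 2 - n2 ** 2)
    return math.floor(v / math.pi) + 1


n1, n2 = 1.55, 1.42   # indices used in Example 10-2
print("single-mode limit on d/lambda:", single_mode_limit(n1, n2))
print("modes per polarization, d/lambda = 1000:", count_guided_modes(n1, n2, 1000.0))
print("modes per polarization, d/lambda = 0.5: ", count_guided_modes(n1, n2, 0.5))
```

Because the TE and TM curves share the same cutoff, the count returned by `count_guided_modes` applies to each polarization separately.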

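Next, we will study the fields in a planar waveguide. Let us take a TE wave and write in the more general terms $\varepsilon_1$, $\mu_1$, $\varepsilon_2$, and $\mu_2$. The electric field is nonzero only in the y direction, and the y-component of the electric field is given by

$$E_y = \begin{cases} C e^{\eta_2 z}e^{ik_x x}, & z \le 0 \\ \left(A e^{ik_{1z}z} + B e^{-ik_{1z}z}\right)e^{ik_x x}, & 0 \le z \le d \\ D e^{-\eta_2(z-d)}e^{ik_x x}, & z \ge d \end{cases} \tag{10.15}$$

where $\eta_2$ is the decay constant in the cladding ($k_{2z} = i\eta_2$) and the time-harmonic term $\exp(-i\omega t)$ is again omitted for simplicity. The magnetic fields can be obtained as $H_x = -(i\omega\mu_1\mu_0)^{-1}(\partial E_y/\partial z)$ and $H_z = (i\omega\mu_1\mu_0)^{-1}(\partial E_y/\partial x)$. There are four boundary conditions for the tangential components to be continuous at $z = 0$ and $z = d$. We end up with a set of homogeneous linear equations of the coefficients $A$, $B$, $C$, and $D$. The solution exists only when the determinant of the characteristic $4 \times 4$ matrix becomes zero and can be expressed in a combined equation as follows:

$$\tan(k_{1z}d)\left(\frac{k_{1z}^2}{\mu_1^2} - \frac{\eta_2^2}{\mu_2^2}\right) = \frac{2k_{1z}\eta_2}{\mu_1\mu_2} \tag{10.16}$$

This is an equivalent expression of the mode equation for a TE wave; for a TM wave, the $\mu$’s are replaced by $\varepsilon$’s. An easier way to solve Eq. (10.15) is by considering the condition of total internal reflection at the boundaries, i.e.,

$$A = Be^{i\delta} \quad \text{and} \quad B = Ae^{i(2k_{1z}d + \delta)} \tag{10.17}$$

The combination gives $e^{i(2k_{1z}d + 2\delta)} = 1$, which is nothing but Eq. (10.12). After substituting $A = Be^{-i2\alpha}$ into Eq. (10.15), boundary conditions require that

$$E_y = \begin{cases} 2e^{-i\alpha}B\cos\left(\dfrac{k_{1z}d}{2} - \dfrac{m\pi}{2}\right)e^{\eta_2 z}e^{ik_x x}, & z \le 0 \\ 2e^{-i\alpha}B\cos\left(k_{1z}z - \dfrac{k_{1z}d}{2} + \dfrac{m\pi}{2}\right)e^{ik_x x}, & 0 \le z \le d \\ 2e^{-i\alpha}B\cos\left(\dfrac{k_{1z}d}{2} + \dfrac{m\pi}{2}\right)e^{-\eta_2(z-d)}e^{ik_x x}, & z \ge d \end{cases} \tag{10.18}$$

where $\alpha = k_{1z}d/2 - m\pi/2$ from Eq. (10.12).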
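The field expressions above can be exercised numerically by first solving the mode condition and then checking the boundary conditions. The sketch below is a hedged illustration, not the author's procedure: it assumes a nonmagnetic, lossless TE waveguide with a vacuum wavelength $\lambda$, solves $k_{1z}d - 2\alpha = m\pi$ by bisection, and evaluates the normalized profile $E_y/(2e^{-i\alpha}B)$ at $x = 0$. The function names and the example values ($n_1 = 1.55$, $n_2 = 1.42$, $d/\lambda = 2$) are assumptions.

```python
import math


def solve_te_mode(n1, n2, d_over_lambda, m):
    """Return (k1z*d, eta2*d) of the TE_m mode by bisection on the mode equation."""
    v = 2.0 * math.pi * d_over_lambda * math.sqrt(n1 ** 2 - n2 ** 2)
    if m * math.pi > v:
        raise ValueError("mode is beyond cutoff")

    def residual(x):
        # tan(alpha) = eta2/k1z for a TE wave in nonmagnetic media
        alpha = math.atan2(math.sqrt(max(v * v - x * x, 0.0)), x)
        return x - 2.0 * alpha - m * math.pi

    lo, hi = 1e-12, v          # residual(lo) < 0 and residual(hi) >= 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, math.sqrt(max(v * v - x * x, 0.0))


def te_profile(z_over_d, k1z_d, eta2_d, m):
    """Normalized field E_y / (2 exp(-i*alpha) B) at x = 0, following Eq. (10.18)."""
    half = 0.5 * k1z_d
    shift = 0.5 * m * math.pi
    if z_over_d < 0.0:
        return math.cos(half - shift) * math.exp(eta2_d * z_over_d)
    if z_over_d > 1.0:
        return math.cos(half + shift) * math.exp(-eta2_d * (z_over_d - 1.0))
    return math.cos(k1z_d * z_over_d - half + shift)


k1z_d, eta2_d = solve_te_mode(1.55, 1.42, 2.0, 1)     # assumed TE1 example
alpha = 0.5 * k1z_d - 0.5 * math.pi
for z in (-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
    print(f"z/d = {z:5.2f}   Ey (normalized) = {te_profile(z, k1z_d, eta2_d, 1): .4f}")
```

At $z = 0$ the slope of the core field is proportional to $k_{1z}\sin\alpha$ and that of the cladding field to $\eta_2\cos\alpha$; the two agree only when the mode equation is satisfied, which provides a convenient check on the solver.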

Figure 10.6a shows the electric field distribution for TE$_0$, TE$_1$, and TE$_2$. The decaying fields inside the cladding are clearly demonstrated. For a cladding with the conductivity $\sigma \to \infty$, the waves will be perfectly reflected at the interface without any phase shift and the electric field must vanish in the cladding. Only the odd $m$’s are guided modes. The first guided mode is TE$_1$, and the guided mode TE$_q$ corresponds to $q = (m + 1)/2$, with $m = 1, 3, 5, \ldots$ The electric fields for the conducting waveguide modes TE$_1$, TE$_2$, and TE$_3$ are shown in Fig. 10.6b for comparison with those for the first three modes in the dielectric waveguide. The difference lies in that no fields can penetrate into the conducting cladding, whereas the fields can penetrate into the dielectric cladding.

FIGURE 10.6 Electric field distribution $E_y(z)$ in planar waveguides. (a) Dielectric cladding, showing TE$_0$, TE$_1$, and TE$_2$. (b) Conductive cladding, showing TE$_1$, TE$_2$, and TE$_3$. For the conducting cladding, $\sigma \to \infty$ and the lowest-order TE mode is the first order.

Example 10-3. Determine the energy flux, phase velocity, and group velocity of the electromagnetic waves in a planar dielectric waveguide.

Solution. Obviously, there is no net energy flow in the z direction, and $\langle S\rangle_x = \frac{1}{2}\mathrm{Re}(E_y H_z^* - E_z H_y^*)$. The second term on the right becomes zero for a TE wave; thus, $\langle S\rangle_x = (k_x/2\omega\mu_0)E_y E_y^*$. Integration of $\langle S\rangle_x$ from $z = -\infty$ to $\infty$ gives the power transmitted per unit length in the y direction. Note that a small portion of energy is transmitted through the cladding. The phase velocity along the x direction is $v_p = \omega/k_x = c/(n_1\sin\theta_1)$. The group velocity for a given mode is given by $v_g = (dk_x/d\omega)^{-1}$, which requires the solution of Eq. (10.16) accounting for the frequency-dependent refractive index.
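As a rough numerical companion to this example, the phase and group velocities of a TE mode can be estimated by solving the mode condition at neighboring frequencies and differentiating $k_x(\omega)$ by central differences. The sketch below makes simplifying assumptions that are not in the text: the refractive indices are taken as independent of frequency (so material dispersion is ignored), the media are nonmagnetic and lossless, and frequency enters only through the dimensionless group $\omega d/c$. Function names and example values are hypothetical.

```python
import math


def te_kx_d(n1, n2, omega_d_over_c, m=0):
    """kx*d of the TE_m mode, found by bisection on k1z*d - 2*alpha = m*pi."""
    k1d = n1 * omega_d_over_c
    k2d = n2 * omega_d_over_c
    v = math.sqrt(k1d ** 2 - k2d ** 2)
    if m * math.pi > v:
        raise ValueError("mode is beyond cutoff")
    lo, hi = 1e-12, v
    for _ in range(200):
        x = 0.5 * (lo + hi)
        alpha = math.atan2(math.sqrt(max(v * v - x * x, 0.0)), x)
        if x - 2.0 * alpha - m * math.pi < 0.0:
            lo = x
        else:
            hi = x
    x = 0.5 * (lo + hi)
    return math.sqrt(k1d ** 2 - x ** 2)


def velocities(n1, n2, omega_d_over_c, m=0, rel_step=1e-4):
    """Return (v_p/c, v_g/c) for the TE_m mode, assuming nondispersive indices."""
    w = omega_d_over_c
    h = rel_step * w
    v_phase = w / te_kx_d(n1, n2, w, m)
    # central-difference estimate of d(kx*d)/d(omega*d/c) = c * dkx/domega
    slope = (te_kx_d(n1, n2, w + h, m) - te_kx_d(n1, n2, w - h, m)) / (2.0 * h)
    return v_phase, 1.0 / slope


vp, vg = velocities(1.55, 1.42, 10.0)   # assumed TE0 example with omega*d/c = 10
print("v_p / c =", vp)
print("v_g / c =", vg)
```

Under these assumptions the phase velocity falls between $c/n_1$ and $c/n_2$, and the group velocity is smaller than the phase velocity; including a frequency-dependent refractive index would change the group velocity further.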

In Chap. 9, we introduced the concept of Fabry-Perot resonant cavities. Two- and three-dimensional optical cavities and microwave cavities support resonance modes, which are standing waves within the cavity. These devices are important for photonics and optoelectronics. Microcavities have also been used to modify the surface radiative properties. The quality factor, or the Q-factor, of a resonator is defined as the ratio of energy storage to the energy dissipation. High Q-factors can be achieved with the microfabricated microcavities for quantum electrodynamics (QED), enhancement and suppression of spontaneous emission, and biological and chemical sensing.14 A special microcavity is made of spheres or disks, where the resonance is built up around a circumference in the form of a polygon. Total internal reflection traps the light inside the microsphere or the disk. At a particular wavelength, when resonance occurs, light undergoes multiple reflections, and a strong electric field which is confined near the perimeter can be built. This is the so-called whispering gallery mode (WGM), named after the whispering gallery at St. Paul’s Cathedral in London. A whispering gallery is a circular gallery under a dome where whispers can be heard from the opposite side of the building. Optical fibers or waveguides are commonly used to couple the photon energy to or from the microcavities via evanescent waves. Ultrahigh Q-factors can be achieved with WGMs. The energy coupling mechanisms have recently been studied by Guo and Quan using a finite-element method.15

A recent development in fiber optics is the use of photonic crystals (PCs) to confine the light into a fiber, whose cladding region is made of PCs, rather than a solid low-index material. The fiber core may be either solid or hollow, and the PCs in the cladding region may contain air-filled holes in silica.
For this reason, these fibers are called photonic crystal fibers (PCFs), and some are called holey fibers.16 In the stop band, waves cannot propagate inside the PC and thus effectively confine the propagating wave to the core region, where the modes can be guided, without using total internal reflection. One of the advantages of PCFs over conventional optical fibers is the spectral broadening that enables high-intensity pulses to be transmitted with less distortion or loss of the spectral information, which have important applications such as optical coherence spectroscopy and tomography. Another advantage is that the use of large guiding areas can provide low-loss high-power delivery for imaging, lithography, and astronomy. Other potential applications range from birefringence and nonlinear optics to atomic particle guidance.16 10.1.3 Photon Tunneling by Coupled Evanescent Waves In the preceding sections, we clearly demonstrated that an evanescent wave exists inside the optically rarer medium, which can be air or vacuum, and decays exponentially away


from the surface. Furthermore, the evanescent wave or field does not carry energy in the direction normal to the interface. On the other hand, if another optically denser medium is brought into close proximity with the first medium, as shown in Fig. 10.7, energy can be

FIGURE 10.7 Illustration of photon tunneling. (a) Schematic drawing of the three layers and fields: propagating waves in medium 1, evanescent waves in medium 2 (thickness d), and a propagating wave in medium 3. (b) Calculated transmittance $T'_\lambda$ versus $d/\lambda$ for a TE wave at $\theta_1 = 10^\circ$, $40^\circ$, $50^\circ$, and $80^\circ$, assuming $n_1 = n_3 = 1.414$ and $n_2 = 1$ ($\theta_c = 45^\circ$). Note the distinct differences between the interference effect and the photon tunneling phenomenon, where the transmittance decreases with increasing d and becomes negligibly small for $d > \lambda$.

transmitted from the first to the third medium, even though the angle of incidence is greater than the critical angle. This phenomenon, known as frustrated total internal reflection, photon tunneling, or radiation tunneling, is very important for energy transfer between two bodies when the distance of separation is shorter than the dominant wavelength of the emitting source. Frustrated total internal reflection has been known since Newton’s time and was theoretically investigated by Hall (Phys. Rev. Ser. I, 15, 73, 1902). Cryogenic insulation is a practical example in which photon tunneling may be significant.17 Advances in micro/nanotechnologies have made it possible for the energy transfer by photon tunneling to be appreciable and even dominant at room temperature or above. This may have applications ranging from microscale thermophotovoltaic devices to nanothermal processing and nanoelectronics thermal management.18–20

While photon tunneling is analogous to electron tunneling through a potential barrier, which may be explained by quantum mechanics, it can be understood by the coupling of two oppositely decaying evanescent waves.21 Because of the second interface, a backward-decaying evanescent wave is formed inside layer 2, the optically rarer medium. The Poynting vector of the coupled evanescent fields has a nonzero normal component, suggesting that the energy transmission between the media is possible as long as the gap width is smaller than the wavelength. For gaps wider than the wavelength, the field strength of the forward-decaying evanescent wave is too low when it reaches the second interface and the reflected evanescent field is negligible. The matrix formulation discussed in Chap. 9 can be used to calculate the transmittance and the reflectance through the gap (i.e., medium 2) as if there were propagating waves. To illustrate this, consider the case in which all three layers are dielectric.
Taking the TM wave incidence as an example, let us write the magnetic field inside medium 2 as follows:

$$H_y(x, z) = \left(A e^{ik_{2z}z} + B e^{-ik_{2z}z}\right) e^{ik_x x}, \quad 0 < z < d \tag{10.19}$$

where A and B are determined by the incident field and boundary conditions. When two waves are combined, the Poynting vector of the field $\langle \mathbf{S} \rangle = \tfrac{1}{2}\mathrm{Re}[(\mathbf{E}_1 + \mathbf{E}_2) \times (\mathbf{H}_1^* + \mathbf{H}_2^*)]$ has four terms. Two of them can be associated with the power flux of each individual wave,


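while the other two represent the interaction between the waves. After simplification, the normal component of the Poynting vector can be expressed as

$$\langle S_z \rangle = \frac{k_{2z}}{2\omega\varepsilon_2\varepsilon_0}\left(|A|^2 - |B|^2\right), \quad \text{when } k_{2z}^2 = k_2^2 - k_x^2 > 0 \tag{10.20a}$$

and

$$\langle S_z \rangle = -\frac{\eta_2}{\omega\varepsilon_2\varepsilon_0}\,\mathrm{Im}(AB^*), \quad \text{when } \eta_2^2 = -k_{2z}^2 = k_x^2 - k_2^2 > 0 \tag{10.20b}$$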

Because there is no loss or absorption, $\langle S_z \rangle$ is independent of z in medium 2, and the ratio of $\langle S_z \rangle$ in medium 2 to that of the incidence in medium 1 is the transmittance. When propagating waves exist in medium 2, that is, when the angle of incidence is smaller than the critical angle, interference will occur and the energy flux in the z direction can be represented by the forward- and backward-propagating waves; see Eq. (10.20a). The transmittance oscillates as the thickness of medium 2 is increased. When evanescent waves exist in medium 2 at incidence angles greater than the critical angle, the transmittance is a decaying function of the thickness of medium 2, as shown in Fig. 10.7b. While the individual evanescent wave does not carry energy, the coupling results in energy transfer, as suggested by Eq. (10.20b). Equation (9.8) through Eq. (9.10), derived in the previous chapter, can be used to calculate the transmittance and the reflectance. These equations are applicable to arbitrary electric and magnetic properties as long as the medium is isotropic and homogeneous within each layer. The phase shift $\beta$ in these equations is purely imaginary for evanescent waves when medium 2 is a dielectric.

Example 10-4. Assuming that the incident field has an amplitude of 1, determine A and B in Eq. (10.19) for $\theta_1 > \theta_c = \sin^{-1}(n_2/n_1)$, when all three media are dielectric with $n_3 = n_1 > n_2$. Find an expression of the tunneling transmittance using real variables only.

Solution. The tangential fields can be written as follows for the three-layer structure shown in Fig. 10.7a. Note that $\eta_2 = \sqrt{k_x^2 - k_2^2} = (2\pi n_1/\lambda)\sqrt{\sin^2\theta_1 - \sin^2\theta_c}$.

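$$H_y = \begin{cases} \left(e^{ik_{1z}z} + r e^{-ik_{1z}z}\right) e^{ik_x x}, & z < 0 \\ \left(A e^{-\eta_2 z} + B e^{\eta_2 z}\right) e^{ik_x x}, & 0 < z < d \\ t\,e^{ik_{1z}(z-d)}\, e^{ik_x x}, & z > d \end{cases} \tag{10.21}$$

$$E_x = \begin{cases} \dfrac{k_{1z}}{\omega n_1^2 \varepsilon_0}\left(e^{ik_{1z}z} - r e^{-ik_{1z}z}\right) e^{ik_x x}, & z < 0 \\ \dfrac{i\eta_2}{\omega n_2^2 \varepsilon_0}\left(A e^{-\eta_2 z} - B e^{\eta_2 z}\right) e^{ik_x x}, & 0 < z < d \\ \dfrac{k_{1z}}{\omega n_1^2 \varepsilon_0}\, t\,e^{ik_{1z}(z-d)}\, e^{ik_x x}, & z > d \end{cases} \tag{10.22}$$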

The continuity of tangential fields at the two interfaces allows us to determine t, r, A, and B. Note that because the incident field has an amplitude of 1, the preceding equations do not yield a set of homogeneous linear equations as in the case of guided waves. If we use Eq. (10.6) for $r_p = e^{-i\delta}$, where $\delta = 2\alpha$ and $\cot(\alpha) = (k_{1z}/n_1^2)/(\eta_2/n_2^2)$ for a TM wave, we can rewrite Eq. (9.7) and Eq. (9.8) to obtain the reflection and transmission coefficients as follows:

$$r = \frac{e^{-i\delta}\left(1 - e^{-2\eta_2 d}\right)}{1 - e^{-2i\delta}e^{-2\eta_2 d}} \tag{10.23}$$

$$t = \frac{\left(1 - e^{-2i\delta}\right)e^{-\eta_2 d}}{1 - e^{-2i\delta}e^{-2\eta_2 d}} \tag{10.24}$$


where we have used the relationship of Fresnel’s coefficients and set the phase shift in Eq. (9.6) to $\beta = i\eta_2 d$. After matching the boundary conditions at $z = d$, we have

$$A = 0.5\,t\,[1 - i\cot(\alpha)]\,e^{\eta_2 d} \quad \text{and} \quad B = 0.5\,t\,[1 + i\cot(\alpha)]\,e^{-\eta_2 d} \tag{10.25}$$

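It can be shown that the normal component of the Poynting vector is the same in media 2 and 3 (see Problem 10.8). The tunneling transmittance becomes $T'_\lambda = tt^*$ or

$$T'_\lambda = \frac{2[1 - \cos(2\delta)]\,e^{-2\eta_2 d}}{1 + e^{-4\eta_2 d} - 2\cos(2\delta)\,e^{-2\eta_2 d}} \tag{10.26a}$$

$$T'_\lambda = \frac{\sin^2(\delta)}{\sin^2(\delta) + \sinh^2(\eta_2 d)} \tag{10.26b}$$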

Clearly, the tunneling transmittance does not oscillate as d increases; rather, it decreases monotonically from 1 to 0 as d is increased from 0 to infinity. Equation (10.23) through Eq. (10.25) can be applied to TE waves by taking $\cot(\alpha) = k_{1z}/\eta_2$, which changes the Fresnel reflection coefficient $r_p$ to $r_s$ because only dielectric media are considered here. Equation (10.26) is convenient for calculating the tunneling transmittance between dielectrics.
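As an illustration, Eq. (10.26) can be evaluated numerically. The following is a minimal Python sketch, not the author's code: the function names are hypothetical, wavevector components are expressed in units of $2\pi/\lambda$, and the default indices are those quoted for Fig. 10.7b ($n_1 = n_3 = 1.414$, $n_2 = 1$).

```python
import math

def _delta_and_eta_d(theta1_deg, d_over_lambda, n1, n2, pol):
    """Return (delta, eta2*d) for incidence above the critical angle."""
    theta1 = math.radians(theta1_deg)
    kx = n1 * math.sin(theta1)              # parallel wavevector, units of 2*pi/lambda
    if kx <= n2:
        raise ValueError("Eq. (10.26) applies only above the critical angle")
    k1z = n1 * math.cos(theta1)             # normal component in medium 1
    eta2 = math.sqrt(kx**2 - n2**2)         # decay constant in the gap
    if pol == "TE":
        cot_alpha = k1z / eta2
    elif pol == "TM":
        cot_alpha = (k1z / n1**2) / (eta2 / n2**2)
    else:
        raise ValueError("pol must be 'TE' or 'TM'")
    delta = 2.0 * math.atan2(1.0, cot_alpha)            # delta = 2*alpha
    return delta, 2.0 * math.pi * eta2 * d_over_lambda

def tunneling_transmittance(theta1_deg, d_over_lambda, n1=1.414, n2=1.0, pol="TE"):
    """Tunneling transmittance in the hyperbolic form of Eq. (10.26b)."""
    delta, x = _delta_and_eta_d(theta1_deg, d_over_lambda, n1, n2, pol)
    return math.sin(delta)**2 / (math.sin(delta)**2 + math.sinh(x)**2)

def tunneling_transmittance_exp_form(theta1_deg, d_over_lambda, n1=1.414, n2=1.0, pol="TE"):
    """Same quantity in the exponential form of Eq. (10.26a)."""
    delta, x = _delta_and_eta_d(theta1_deg, d_over_lambda, n1, n2, pol)
    num = 2.0 * (1.0 - math.cos(2.0 * delta)) * math.exp(-2.0 * x)
    den = 1.0 + math.exp(-4.0 * x) - 2.0 * math.cos(2.0 * delta) * math.exp(-2.0 * x)
    return num / den

if __name__ == "__main__":
    for d in (0.0, 0.2, 0.4, 0.8, 1.6):
        print(d, tunneling_transmittance(50.0, d), tunneling_transmittance(80.0, d))
```

The two functions should agree to rounding error, and the printed values should decrease monotonically with $d/\lambda$ for each angle, as stated above.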

10.1.4 Thermal Energy Transfer between Closely Spaced Dielectrics

Energy exchange between closely spaced dielectric plates can be calculated by integrating Planck’s function over all wavelengths as well as over the whole hemisphere using the directional-spectral transmittance. Let us use an example to illustrate the procedure and the effect of photon tunneling and interferences on the near-field thermal radiation.

Example 10-5. Calculate the hemispherical transmittance between two dielectrics of $n_1 = n_3 = 3$, separated by a vacuum gap d ($n_2 = 1$). Use the results to calculate the radiative energy transfer between the two media, assuming $T_1 = 1000$ K and $T_3 = 300$ K.

Analysis. In the far field, we can use the following formula discussed in Chap. 2 (see Example 2-6) to calculate the net radiative heat flux:

$$q''_{13,\,d\to\infty} = \frac{\sigma_{SB}T_1^4 - \sigma_{SB}T_3^4}{1/\varepsilon_1 + 1/\varepsilon_3 - 1} \tag{10.27}$$

The hemispherical emissivity of each surface can be evaluated using Eq. (8.86), which can be rewritten as follows, considering that the emissivity is independent of the azimuthal angle $\phi$:

$$\varepsilon_{\lambda,h} = 2\int_0^{\pi/2} \varepsilon'_\lambda(\theta)\cos\theta\,\sin\theta\,d\theta \tag{10.28}$$

One could average the directional-spectral emissivity over the two polarizations. However, the preferable way is to calculate the hemispherical emissivity for each polarization and use it to calculate the net heat flux by taking half of Eq. (10.27). The heat fluxes calculated for the two polarizations can then be added to obtain the total heat flux. The results give the far-field limit, which is always smaller than $q''_{13,BB} = \sigma_{SB}(T_1^4 - T_3^4)$, the net radiative heat flux between two blackbodies. This will not be the case in the near field when interference and tunneling effects are important.

Solution. The hemispherical transmittance can be evaluated in a similar way by integration over the hemisphere. Note that only a small cone of radiation, originating from medium 1, will result in propagating waves in medium 2. The half cone angle is the critical angle, $\theta_c = \sin^{-1}(n_2/n_1) \approx 19.5^\circ$. Thus, we can divide the hemispherical transmittance into two parts and evaluate them separately. Keeping in mind that the transmittance is defined as the ratio of the transmitted energy to the incident energy, we can sum the two parts to obtain the hemispherical transmittance


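$$T_{\lambda,h} = T_{\lambda,\mathrm{prop}} + T_{\lambda,\mathrm{evan}} \tag{10.29}$$

where

$$T_{\lambda,\mathrm{prop}} = 2\int_0^{\theta_c} T'_\lambda \cos\theta\,\sin\theta\,d\theta \tag{10.29a}$$

and

$$T_{\lambda,\mathrm{evan}} = 2\int_{\theta_c}^{\pi/2} T'_\lambda \cos\theta\,\sin\theta\,d\theta \tag{10.29b}$$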

If $n_1 \neq n_3$, $\theta_c$ will depend on whether the incidence is from medium 1 or 3, but the resulting hemispherical transmittance will be the same. We can obtain the average transmittance for the two polarizations, as shown in Fig. 10.8a. The propagating wave contribution shows some oscillations but reaches a constant value when $d/\lambda \to 0$, where all waves add constructively. At $d/\lambda \gg 1$, the constructive and destructive interferences cancel out so that $T_{\lambda,\mathrm{prop}}$ becomes a constant again. The

FIGURE 10.8 Radiation heat transfer between dielectric surfaces in close proximity ($n_1 = n_3 = 3$, $n_2 = 1$). (a) Contributions to hemispherical transmittance by interference ($T_{\lambda,\mathrm{prop}}$) and tunneling ($T_{\lambda,\mathrm{evan}}$) versus $d/\lambda$, where the transmittance is the average of both polarizations. (b) Net heat flux $q''_{13}$ (W/m²) as a function of the distance of separation d (µm) for $T_1 = 1000$ K and $T_3 = 300$ K, showing the propagating, evanescent, and combined contributions.

contribution of evanescent waves becomes important when $d/\lambda < 1$ and starts to dominate over that of the propagating waves when $d/\lambda \ll 1$. When $d/\lambda \to 0$, the evanescent wave or tunneling contributes nearly 90% of the transmittance when $n_1 = 3$. This explains why photon tunneling is very important for the near-field energy transfer. Planck’s blackbody distribution function, given by Eq. (8.44), can be rewritten for each polarization in media 1 and 3, respectively, as

$$e_{b,\lambda}(\lambda, T_1) = \frac{n_1^2 C_1}{2\lambda^5\left(e^{C_2/\lambda T_1} - 1\right)} \tag{10.30a}$$

and

$$e_{b,\lambda}(\lambda, T_3) = \frac{n_3^2 C_1}{2\lambda^5\left(e^{C_2/\lambda T_3} - 1\right)} \tag{10.30b}$$

where $\lambda$ in µm is the wavelength in vacuum, and $C_1 = 3.742 \times 10^8$ W·µm⁴/m² and $C_2 = 1.439 \times 10^4$ µm·K are the first and second radiation constants in vacuum. The emissive power in a nondispersive dielectric is increased by a factor of the square of the refractive index, as a result of the increased photon density of states. The factor 2 in the denominator is included because only a single polarization has been considered. The net radiation heat flux from medium 1 to 3 is


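$$q''_{1\to3} = \int_0^\infty e_{b,\lambda}(\lambda, T_1)\,T_{\lambda,h}(\lambda)\,d\lambda \tag{10.31a}$$

and that from medium 3 to 1 is

$$q''_{3\to1} = \int_0^\infty e_{b,\lambda}(\lambda, T_3)\,T_{\lambda,h}(\lambda)\,d\lambda \tag{10.31b}$$

where $T_{\lambda,h}$ is obtained from Eq. (10.29). Hence, the net radiation heat transfer becomes

$$q''_{13} = q''_{1\to3} - q''_{3\to1} \tag{10.32}$$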

One can also separately substitute the hemispherical transmittance of propagating and evanescent waves into Eq. (10.31). Equation (10.32) should be individually applied to TE and TM waves, and then summed together to get the net heat flux. The integration limits can be set such that the lower limit $\lambda_L = 0.1\lambda_{mp}$ and the upper limit $\lambda_H = 10\lambda_{mp}$, where $\lambda_{mp}$ is the wavelength corresponding to the maximum blackbody emissive power at the temperature of the higher-temperature medium as expressed in Eq. (8.45). The calculated results of the near-field radiative transfer are shown in Fig. 10.8b as a function of the separation distance d. Several important observations can be made.

(a) When $d \ll \lambda_{mp}$, the propagating waves result in $q''_{13,\mathrm{prop}} = \sigma_{SB}(T_1^4 - T_3^4)$ and the evanescent waves result in $q''_{13,\mathrm{evan}} = (n_1^2 - 1)\sigma_{SB}(T_1^4 - T_3^4)$. The combined net radiation heat transfer is $q''_{13,\mathrm{comb}} = n_1^2\sigma_{SB}(T_1^4 - T_3^4)$.
(b) As the distance increases, the evanescent wave contribution goes down monotonically and becomes negligible when $d > \lambda_{mp}$, which is about 3 µm.
(c) Due to interference effects, the energy transfer by propagating waves decreases slightly as d increases and then reaches the far-field limit, Eq. (10.27), when $d \gg \lambda_{mp}$.
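The hemispherical integrals of Eq. (10.29) and the small-gap limit in observation (a) can be checked numerically. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the directional transmittance is taken from the standard three-layer (Airy) expression for a symmetric, lossless structure rather than from the author's code, the integrals use a simple midpoint rule, and $\sigma_{SB} = 5.670 \times 10^{-8}$ W/(m²·K⁴) is assumed.

```python
import numpy as np

SIGMA_SB = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4) (assumed value)

def _kz(n, kx):
    """Normal wavevector component in units of 2*pi/lambda, with Im(kz) >= 0."""
    q = np.sqrt(complex(n * n - kx * kx))
    return q if q.imag >= 0 else -q

def directional_transmittance(theta1, d_over_lambda, n1=3.0, n2=1.0, pol="TE"):
    """|t|^2 for medium 1 / gap / medium 1, valid below and above the critical angle."""
    kx = n1 * np.sin(theta1)
    k1z, k2z = _kz(n1, kx), _kz(n2, kx)
    if pol == "TE":
        p1, p2 = k1z, k2z
    else:  # TM: normal wavevector components weighted by 1/n^2
        p1, p2 = k1z / n1**2, k2z / n2**2
    r12 = (p1 - p2) / (p1 + p2)
    phase = np.exp(1j * 2.0 * np.pi * k2z * d_over_lambda)   # e^{i*beta}
    t = (1.0 - r12**2) * phase / (1.0 - r12**2 * phase**2)
    return abs(t)**2

def hemispherical_transmittance(d_over_lambda, n1=3.0, n2=1.0, n_pts=2000):
    """Return (T_prop, T_evan) of Eq. (10.29a,b), averaged over TE and TM."""
    theta_c = np.arcsin(n2 / n1)

    def integrate(lo, hi):
        h = (hi - lo) / n_pts
        total = 0.0
        for th in lo + h * (np.arange(n_pts) + 0.5):   # midpoints avoid theta_c and pi/2
            T = 0.5 * (directional_transmittance(th, d_over_lambda, n1, n2, "TE")
                       + directional_transmittance(th, d_over_lambda, n1, n2, "TM"))
            total += T * np.cos(th) * np.sin(th)
        return 2.0 * total * h

    return integrate(0.0, theta_c), integrate(theta_c, np.pi / 2.0)

def near_field_limit_flux(n1, T1, T3):
    """Combined net flux in the d -> 0 limit, n1^2 * sigma_SB * (T1^4 - T3^4)."""
    return n1**2 * SIGMA_SB * (T1**4 - T3**4)

if __name__ == "__main__":
    for d in (0.0, 0.01, 0.1, 1.0, 5.0):
        print(d, hemispherical_transmittance(d))
    print(near_field_limit_flux(3.0, 1000.0, 300.0))
```

At $d/\lambda = 0$ the directional transmittance is unity at all angles, so the two parts reduce to $\sin^2\theta_c = 1/n_1^2$ and $1 - 1/n_1^2$, which is the origin of the "nearly 90%" tunneling share for $n_1 = 3$ and of the factor $n_1^2$ in the combined flux.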

If the media are conductive, the preceding calculations are not appropriate because of the large imaginary part of the refractive index or the dielectric function. In fact, the near-field radiation heat transfer can be greatly enhanced in the presence of surface waves or if the media are semiconductors.18–20 The treatment requires the knowledge of fluctuational electrodynamics, which will be discussed at length in Sec. 10.5.

10.1.5 Resonance Tunneling through Periodic Dielectric Layers

There exists a photonic analogue of resonance tunneling of electrons in double-barrier quantum well structures. The geometry to illustrate resonance photon tunneling is depicted in Fig. 10.9a, with periodic layers of thicknesses a and b, like the photonic crystal (PC) structure discussed in Sec. 9.3, with a period $\Lambda = a + b$. For tunneling to occur, the double-prism structure can be used so that light is incident from medium 1 with a refractive index $n_1$. The barrier of thickness b is made of another dielectric with a refractive index $n_2$ that is lower than $n_1$. There are N periods or unit cells in total between the end media. Light is incident at an incidence angle $\theta_1 > \theta_c = \sin^{-1}(n_2/n_1)$. Yeh performed a detailed analysis of this phenomenon and derived the equation of transmittance,22 which can be expressed as

$$T'_\lambda = \frac{1}{1 + \dfrac{\sinh^2(\eta b)}{\sin^2(\delta)}\,\dfrac{\sin^2(NK\Lambda)}{\sin^2(K\Lambda)}} \tag{10.33}$$

where K is the Bloch wavevector of the PC, $\delta$ is the phase angle upon total internal reflection, and $\eta$ is the imaginary part of the normal component of the wavevector in the lower-index dielectric, as defined in Example 10-4. It can be seen that Eq. (10.33) reduces to Eq. (10.26a) and Eq. (10.26b) for $N = 1$, where the transmittance is 1 at $b = 0$ and decreases monotonically with increasing b.


FIGURE 10.9 Resonance tunneling. (a) Alternating high-index ($n_1$) and low-index ($n_2$) dielectric layers of thicknesses a and b (period $\Lambda$, N periods) for resonance tunneling. (b) Calculated transmittance spectra $T'_\lambda$ versus $\lambda/b$ for $N = 2$ and $N = 5$, at two wavelength regions. Calculation conditions are $n_1 = 3$, $n_2 = 2$, $a = b/2$, and $\theta_1 = 45^\circ$.

The following equation can be used to calculate K:

$$\cos(K\Lambda) = \cos(k_{1z}a)\cosh(\eta b) - \cot(\delta)\sin(k_{1z}a)\sinh(\eta b) \tag{10.34}$$

where $k_{1z}$ is the normal component of the wavevector in medium 1. While $\cos(K\Lambda)$ is real, K is in general complex. However, there exist regions or pass bands where $|\cos(K\Lambda)| < 1$ so that K is real. The transmittance does not oscillate in the pass bands, unlike what was shown in Fig. 9.17, where propagating waves exist in both types of dielectrics. Here, evanescent waves exist in the lower-index dielectric layers. However, the transmittance expressed in Eq. (10.33) becomes unity when the following equation holds:

$$\frac{\sin(NK\Lambda)}{\sin(K\Lambda)} = 0 \tag{10.35}$$

The denominator of this equation simply excludes the zeros in $\sin(NK\Lambda)$ for $K\Lambda = m\pi$, $m = 0, \pm1, \pm2, \ldots$ It turns out that in each pass band, there exist $(N - 1)$ solutions, with different combinations of $\omega$, $k_x$, and the thicknesses a and b. As an example, Fig. 10.9b illustrates the transmittance as a function of $\lambda/b$ when $n_1 = 3$, $n_2 = 2$, $\theta_1 = 45^\circ$, and $a/b = 0.5$. Because of the narrow transmittance peaks, the plot is broken into two panels, each corresponding to a pass band. For $N = 2$, there is only one peak in each pass band, while for $N = 5$, there are four peaks. Yeh showed that the resonance frequencies correspond to the guided modes in the multilayer-waveguide equations.22 Hence, the fields are highly localized near the higher-index layer. Total internal reflection causes very high


reflection on the surfaces of the higher-index layer and produces resonances similar to those in a Fabry-Perot cavity resonator. It should be noted that extremely sharp transmittance peaks can be obtained when $\lambda$ is close to the gap thickness b (see the upper panel). Further investigation on resonance tunneling is needed for the application in narrow band-pass filters. Due to the guided modes and the localized field, the magnitude of the evanescent wave may be amplified in the forward direction in some region (see Problem 10.11). Similar to the lateral shift by total internal reflection, due to the parallel energy flow in the high-index layer (waveguide), there must be a lateral shift of the transmitted light for finite beams. Little has been reported in the literature about the beam shift and the field distribution in dielectric multilayer structures when resonance tunneling occurs.

10.1.6 Photon Tunneling with Negative Index Materials

Negative index materials (NIMs), for which the permittivity and the permeability become negative simultaneously in a given frequency region, can also be used to enhance photon tunneling.23 The basics of NIMs have already been presented in Sec. 8.4.6. The structure is illustrated in Fig. 10.10a with a pair of layers in between two prisms. One of the layers has

FIGURE 10.10 Photon tunneling with a layer of NIM. (a) The tunneling arrangement. (b) The field distribution in the middle layers for a TE wave.

a negative refractive index. Assume that one of the layers is vacuum and the other has $\varepsilon = \mu = -1$, so its refractive index is exactly $-1$. The transmittance becomes unity when the thickness of the NIM layer and that of the vacuum are the same, regardless of the angle of incidence and polarization. Let us use the full notation of $\varepsilon$ and $\mu$ without using the refractive index. The transmission coefficient can be expressed as follows:23

$$t = \frac{8}{\xi_1 e^{-i\phi_1} + \xi_2 e^{i\phi_1} + \xi_3 e^{-i\phi_2} + \xi_4 e^{i\phi_2}} \tag{10.36}$$

Here, the phase angles $\phi_1$ and $\phi_2$ can be expressed as

$$\phi_1 = k_{2z}d_2 + k_{3z}d_3 \quad \text{and} \quad \phi_2 = k_{2z}d_2 - k_{3z}d_3 \tag{10.37}$$


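where $d_2$ and $d_3$ are the thicknesses of layers 2 and 3, and $k_{2z}$ and $k_{3z}$ are the normal components of the wavevector in media 2 and 3, respectively. Note that when tunneling occurs, $k_{2z}$ and $k_{3z}$ become purely imaginary for the lossless case, as will be discussed later. For a TE wave, the coefficients in Eq. (10.36) are

$$\xi_1 = \left(1 + \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 + \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 + \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38a}$$

$$\xi_2 = \left(1 - \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 + \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 - \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38b}$$

$$\xi_3 = \left(1 + \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 - \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 - \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38c}$$

and

$$\xi_4 = \left(1 - \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 - \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 + \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38d}$$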

For a TM wave, the transmission coefficient is defined based on the magnetic fields and the coefficients can be easily obtained by substituting $\varepsilon$’s for $\mu$’s in Eq. (10.38). The sign selection of $k_{lz}$ was mentioned in Sec. 9.2.2 in the discussion of the matrix formulation. Basically, when there exist propagating waves in medium l, $k_{lz} = (2\pi n_l/\lambda)\sqrt{1 - (n_1/n_l)^2\sin^2\theta_1}$, whose sign becomes negative in a NIM. On the other hand, if the waves become evanescent in medium l, we use $k_{lz} = i(2\pi/\lambda)\sqrt{n_1^2\sin^2\theta_1 - n_l^2} = i\eta_l$. Here, $\eta_l$ is always positive in a lossless medium, even in a NIM. Assume that the prisms are made of the same material so that the properties of medium 1 and medium 4 are identical. Furthermore, layer 2 is made of a NIM with index-matching conditions, i.e., $\varepsilon_2 = -\varepsilon_3$ and $\mu_2 = -\mu_3$ so that $n_2 = -n_3$. Eq. (10.36) can then be further simplified. For propagating waves in the middle layers, $k_{2z} = -k_{3z}$ and $\xi_3 = \xi_4 = 0$; thus,

$$t = \frac{1}{\cos(k_{3z}\Delta) - iY\sin(k_{3z}\Delta)} \tag{10.39}$$

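where $\Delta = d_3 - d_2$, $Y = \tfrac{1}{2}\left(k_{3z}\mu_1/k_{1z}\mu_3 + k_{1z}\mu_3/k_{3z}\mu_1\right)$ for TE waves, and $Y = \tfrac{1}{2}\left(k_{3z}\varepsilon_1/k_{1z}\varepsilon_3 + k_{1z}\varepsilon_3/k_{3z}\varepsilon_1\right)$ for TM waves. Because media 1 and 4 are made of the same material, the transmittance for propagating waves can be written as follows:

$$T'_\lambda = \frac{1}{\cos^2(k_{3z}\Delta) + Y^2\sin^2(k_{3z}\Delta)} \tag{10.40}$$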

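For evanescent waves, we have $k_{2z} = k_{3z} = i\eta_3$, where $\eta_3 = (2\pi/\lambda)\sqrt{n_1^2\sin^2\theta_1 - n_3^2}$. Now that $\xi_1 = \xi_2 = 0$, Eq. (10.36) can be simplified so that

$$t = \frac{1}{\cosh(\eta_3\Delta) + i\cot(\delta)\sinh(\eta_3\Delta)} \tag{10.41}$$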

where $\cot(\delta) = \tfrac{1}{2}\left(\eta_3\mu_1/k_{1z}\mu_3 - k_{1z}\mu_3/\eta_3\mu_1\right)$, with $\delta$ being the phase change upon total internal reflection between media 1 and 2. The transmittance $T'_\lambda = tt^*$ is real and always decreases with increasing $|\Delta|$, where $\Delta$ is the difference between the layer thicknesses. Although Eq. (10.39) and Eq. (10.41) are identical because $\sin(ix) = i\sinh(x)$ and $\cos(ix) = \cosh(x)$, the use of real


variables allows us to observe the variation of transmittance with $\Delta$ easily. When tunneling occurs, the field is highly localized near the interface between the NIM and the PIM layers, as shown in Fig. 10.10b for a TE wave, where the fields are the sum of the forward-decaying and backward-decaying evanescent waves. The amplitude of the evanescent wave in the NIM increases in the direction of energy flow. It can be shown that the amplitude will still increase in medium 2, even when the NIM is placed in layer 3 and layer 2 is a vacuum. This corresponds to another resonance effect, which is associated with the excitation of surface electromagnetic waves or surface polaritons, to be discussed in the next section. The directional and hemispherical transmittances for the structure shown in Fig. 10.10a are illustrated in Fig. 10.11 with the following parameters: $n_1 = n_4 = 1.5$, $n_2 = -1$

FIGURE 10.11 Transmittance for a four-layer structure with one middle layer being matching-index NIM ($n_1 = n_4 = 1.5$, $n_2 = -1$, $n_3 = 1$). (a) Directional transmittance $T'_\lambda$ versus $(d_3 - d_2)/\lambda$ for TE and TM waves at $\theta_1 = 45^\circ$ and $60^\circ$. (b) Hemispherical transmittance versus $d_3/\lambda$ for $d_2/\lambda = 1$, showing $T_{\lambda,h}$, $T_{\lambda,\mathrm{prop}}$, and $T_{\lambda,\mathrm{evan}}$.

($\varepsilon_2 = \mu_2 = -1$), and $n_3 = 1$ (vacuum). Both the directional and hemispherical transmittances become 1 when $d_3 = d_2$. The hemispherical transmittance has two components due to propagating and evanescent waves. The effects of loss and dispersion have also been examined.24
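As a worked step added for clarity, the real-variable form of the tunneling transmittance follows directly from Eq. (10.41). Because $\cot(\delta)$ and $\eta_3\Delta$ are real in the lossless case, the squared magnitude of the denominator is the sum of the squares of its real and imaginary parts, whatever the sign of the imaginary term:

```latex
T'_\lambda = tt^*
  = \frac{1}{\cosh^2(\eta_3\Delta) + \cot^2(\delta)\,\sinh^2(\eta_3\Delta)}
  = \frac{\sin^2(\delta)}{\sin^2(\delta) + \sinh^2(\eta_3\Delta)}
```

The last step uses $\cosh^2 x = 1 + \sinh^2 x$. The result has the same form as Eq. (10.26b) with the gap width d replaced by $\Delta = d_3 - d_2$; it equals 1 at $\Delta = 0$ and is even in $\Delta$, consistent with the unit transmittance at $d_3 = d_2$ noted above.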

10.2 POLARITONS OR ELECTROMAGNETIC SURFACE WAVES

Surface plasmons, also known as surface plasmon polaritons, play an important role in near-field microscopy, nanophotonics, and biomolecular sensor applications.25–27 Surface plasmon polaritons represent the interaction between electromagnetic waves and the oscillatory movement of free charges near the surface of metallic materials. When surface plasmons are confined to small structures, such as the tip of a scanning microscopic probe, quantum dots or nanoparticles, nanowires, or nanoapertures, they are referred to as localized plasmons. Surface plasmons usually occur in the visible or near-infrared region of the electromagnetic spectrum for highly conductive metals such as Ag, Al, and Au. In some polar dielectric materials, phonons or bound charges can also interact with the electromagnetic waves in the mid-infrared spectral region and cause resonance effects near the surface; these are called surface phonon polaritons, which have applications in tuning the thermal emission properties28 and nanoscale nondestructive imaging.29 In the following, the basic mechanisms of surface polaritons will be presented, with discussions on some


important applications. Emphasis is placed on the quantitative analysis of radiative properties for layered structures. In Sec. 10.4, the superlens concept will be introduced for imaging beyond the diffraction limit, and the energy streamline method will be presented for analyzing the energy propagation direction in the near-field regime.

10.2.1 Surface Plasmon and Phonon Polaritons

Plasmons are quasiparticles associated with oscillations of plasma, which is a collection of charged particles such as electrons in a metal or semiconductor. Plasmons are longitudinal excitations that can occur either in the bulk or at the interface. As shown in Fig. 10.12a, the

FIGURE 10.12 Illustration of surface plasmon polariton. (a) Charge fluctuations and the magnetic field at the interface between a metal ($\varepsilon_2$) and air ($\varepsilon_1$). (b) The exponentially decaying field amplitudes away from the interface, with $H_y \propto e^{\eta_1 z}$ in air and $H_y \propto e^{-\eta_2 z}$ in the metal.

charges oscillate along the surface, and such an excitation is called a surface plasmon or surface plasmon polariton. The field associated with a plasmon is localized at the surface, and the amplitude decays away from the interface, as shown in Fig. 10.12b. Such a wave propagates along the surface, and therefore, it is called a surface electromagnetic wave, similar to surface waves in fluids or acoustic surface waves. Surface plasmons can be excited by electromagnetic waves and are important for the study of optical properties of metallic materials, especially near the plasma frequency, which usually lies in the ultraviolet. The requirement of evanescent waves on both sides of the interface prohibits the coupling of propagating waves in air to the surface plasmons. For this reason, surface waves are often regarded as nonradiative modes. The attenuated total reflectance (ATR) arrangements are commonly used to excite surface plasmons. When light is incident from the prism, it is possible for evanescent waves to occur simultaneously in the underlying metallic and air layers, as shown in Fig. 10.13, for the two typical configurations named after A. Otto (prism-air-metal) and E. Kretschmann and H. Raether (prism-metal-air). A detailed discussion with historical aspects can be found in Raether.30 In addition to the requirement of evanescent waves on both sides of the interface, the polariton dispersion relations must be satisfied. They are expressed as follows when both media extend to infinity in the z direction:

$$\frac{k_{1z}}{\varepsilon_1} + \frac{k_{2z}}{\varepsilon_2} = 0 \quad \text{for TM wave} \tag{10.42}$$

$$\frac{k_{1z}}{\mu_1} + \frac{k_{2z}}{\mu_2} = 0 \quad \text{for TE wave} \tag{10.43}$$

FIGURE 10.13 Typical configurations for coupling electromagnetic waves with surface polaritons using attenuated total reflectance arrangements, with light incident at angle $\theta$ from the prism ($\varepsilon_d$) and parallel wavevector component $k_x$. (a) The Otto configuration (prism-air-metal). (b) The Kretschmann-Raether configuration (prism-metal-air). Note that a polar dielectric may substitute for the metal to excite surface phonon polaritons.

Let us consider lossless media first. In order for evanescent waves to occur, we must have $k_{1z} = i\eta_1$ and $k_{2z} = i\eta_2$ with $\eta_1$ and $\eta_2$ being positive, in order for the field $e^{ik_x x - ik_{1z}z} = e^{ik_x x + \eta_1 z}$ to decay toward $z \to -\infty$ and $e^{ik_x x + ik_{2z}z} = e^{ik_x x - \eta_2 z}$ to decay toward $z \to \infty$. This means that the sign of permittivity must be opposite for media 1 and 2 in order to couple a surface polariton with a TM wave. On the other hand, we will need a magnetic material with negative permeability for a TE wave to be able to couple with a surface polariton. NIMs exhibit simultaneously negative permittivity and permeability in the same frequency region and are sometimes called double-negative (DNG) materials. Therefore, both TE and TM waves may excite surface plasmon polaritons with a NIM, as predicted by Ruppin.31 When compared with Fresnel’s reflection coefficients, as can be seen from Eq. (10.2), the condition for the excitation of surface polaritons is that the denominator of the reflection coefficient be zero. A pole in the reflection coefficient is an indication of a resonance. Very often, the surface plasmon polariton is referred to in the literature as a surface plasmon resonance. Taking a TM wave for example, since $k_{1z}^2 = \mu_1\varepsilon_1\omega^2/c^2 - k_x^2$ and $k_{2z}^2 = \mu_2\varepsilon_2\omega^2/c^2 - k_x^2$ from Eq. (10.1), we can solve Eq. (10.42) to obtain

$$k_x = \frac{\omega}{c}\sqrt{\frac{\mu_1/\varepsilon_1 - \mu_2/\varepsilon_2}{1/\varepsilon_1^2 - 1/\varepsilon_2^2}} \tag{10.44}$$

Equation (10.44) relates the frequency with the parallel component of the wavevector and is another form of the polariton dispersion relation. It should be noted that solutions of this equation are for both $k_{1z}/\varepsilon_1 + k_{2z}/\varepsilon_2 = 0$ and $k_{1z}/\varepsilon_1 - k_{2z}/\varepsilon_2 = 0$, i.e., not only the poles but also the zeros of the Fresnel reflection coefficient are included. For nonmagnetic materials, Eq. (10.44) becomes

$$k_x = \frac{\omega}{c}\sqrt{\frac{\varepsilon_1\varepsilon_2}{\varepsilon_1 + \varepsilon_2}} \tag{10.45}$$
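The algebra leading from Eq. (10.42) to Eq. (10.44) can be written out as a short worked step, using only the relations quoted above. Squaring $k_{1z}/\varepsilon_1 = -k_{2z}/\varepsilon_2$ and substituting the expressions for $k_{1z}^2$ and $k_{2z}^2$ gives

```latex
\frac{\mu_1\varepsilon_1\omega^2/c^2 - k_x^2}{\varepsilon_1^2}
  = \frac{\mu_2\varepsilon_2\omega^2/c^2 - k_x^2}{\varepsilon_2^2}
\quad\Longrightarrow\quad
k_x^2\left(\frac{1}{\varepsilon_1^2} - \frac{1}{\varepsilon_2^2}\right)
  = \frac{\omega^2}{c^2}\left(\frac{\mu_1}{\varepsilon_1} - \frac{\mu_2}{\varepsilon_2}\right)
```

For nonmagnetic media ($\mu_1 = \mu_2 = 1$), the ratio simplifies as $(1/\varepsilon_1 - 1/\varepsilon_2)/(1/\varepsilon_1^2 - 1/\varepsilon_2^2) = 1/(1/\varepsilon_1 + 1/\varepsilon_2) = \varepsilon_1\varepsilon_2/(\varepsilon_1 + \varepsilon_2)$. Because the equation was squared, the same result is obtained from $k_{1z}/\varepsilon_1 - k_{2z}/\varepsilon_2 = 0$, which is why the zeros of the reflection coefficient appear among the solutions alongside the poles.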

One should bear in mind that the permittivities are in general functions of the frequency. For a metal with a negative real permittivity, the normal component of the wavevector is purely imaginary for any real $k_x$ because $\mu\varepsilon\omega^2/c^2 < 0$. Thus, evanescent waves exist in metals regardless of the angle of incidence.
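To make the frequency dependence concrete, Eq. (10.45) can be evaluated for a Drude metal. The following is a minimal Python sketch, not the author's code: the function names are hypothetical, the Drude parameters for Al ($\omega_p = 2.4 \times 10^{16}$ rad/s, $\gamma = 1.4 \times 10^{14}$ rad/s) and the prism permittivity $\varepsilon_d = 2.46$ are the values quoted in Example 10-6, and $c = 2.998 \times 10^8$ m/s is assumed.

```python
import cmath
import math

C0 = 2.998e8        # speed of light in vacuum, m/s (assumed value)
OMEGA_P = 2.4e16    # Al plasma frequency, rad/s (Example 10-6)
GAMMA = 1.4e14      # Al scattering rate, rad/s (Example 10-6)

def drude_permittivity(omega, omega_p=OMEGA_P, gamma=GAMMA):
    """Drude dielectric function, 1 - wp^2 / (w^2 + i*w*gamma)."""
    return 1.0 - omega_p**2 / (omega**2 + 1j * omega * gamma)

def polariton_kx(omega, eps1=1.0, eps2=None):
    """Complex kx from Eq. (10.45) for a real frequency omega."""
    if eps2 is None:
        eps2 = drude_permittivity(omega)
    return (omega / C0) * cmath.sqrt(eps1 * eps2 / (eps1 + eps2))

if __name__ == "__main__":
    omega = 2.0 * math.pi * C0 / 500e-9          # lambda = 500 nm
    eps_al = drude_permittivity(omega)
    kx = polariton_kx(omega)
    k0 = omega / C0
    eps_d = 2.46                                 # KBr prism (Example 10-6)
    theta_c = math.degrees(math.asin(1.0 / math.sqrt(eps_d)))
    theta_sp = math.degrees(math.asin(kx.real / (math.sqrt(eps_d) * k0)))
    print("eps_Al =", eps_al)
    print("kx/k0 =", kx / k0)
    print("critical angle (deg) =", theta_c)
    print("prism angle matching Re(kx) (deg) =", theta_sp)
```

The real part of $k_x$ lies slightly to the right of the light line in air, so the prism angle that matches it falls just above the critical angle; the imaginary part is positive, corresponding to a surface wave that decays as it propagates along x.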


Consider either the Otto or Kretschmann-Raether configuration, and use the three-layer structure with a middle layer, medium 1, of thickness d. From Eq. (9.7), the reflection coefficient can be expressed as follows:

$$r = \frac{r_{01} + r_{12}e^{2i\beta}}{1 + r_{01}r_{12}e^{2i\beta}} = \frac{r_{01} + r_{12}e^{-2\eta_1 d}}{1 + r_{01}r_{12}e^{-2\eta_1 d}} \tag{10.46}$$

where the subscript 0 signifies the incidence medium, which is the prism, and $\beta = k_{1z}d = i\eta_1 d$. When d is sufficiently large, $\exp(-2\eta_1 d) \ll 1$, and the reflectance $R'_\lambda = rr^* \approx r_{01}r_{01}^*$ is close to unity. When surface polaritons are excited, however, $r_{12}$ increases dramatically and thus it is possible for $r_{12}e^{-2\eta_1 d}$ to be of the same magnitude as $r_{01}$, but with an opposite phase, i.e., with a phase difference of $\pi$. At the condition of surface plasmon resonance, the reflectance $R'_\lambda$ drops suddenly. Let us use an example to illustrate the polariton dispersion curves and the effect on the reflectance in ATR arrangements.

Example 10-6. Calculate the dispersion relation between Al and air. Calculate the reflectance versus angle of incidence for both the Otto and Kretschmann-Raether configurations at $\lambda = 500$ nm, using Al as the metallic material. Determine the polariton propagation length at the wavelength $\lambda = 500$ nm. Assume the prism is made of KBr with $\varepsilon_d = 2.46$ and the dielectric function of Al can be described by the Drude model.

Solution. The Drude model parameters for Al have been given in Example 8-6. Thus, we have $\varepsilon_2(\omega) = 1 - \omega_p^2/(\omega^2 + i\omega\gamma)$, where the plasma frequency $\omega_p = 2.4 \times 10^{16}$ rad/s and the scattering rate $\gamma = 1.4 \times 10^{14}$ rad/s. One way to calculate the dispersion relation is to assume $\omega$ is real and calculate $k_x(\omega) = k'_x(\omega) + ik''_x(\omega)$. The dispersion curves between Al and air ($\varepsilon_{\mathrm{air}} = 1$) are usually plotted in an $\omega$-$k_x$ graph; the real part of $k_x$ is shown in Fig. 10.14a by the solid line. At very low frequencies, the magnitude of $\varepsilon_2$ is so large that $k_x \approx \omega/c$. Note that the dash-dotted line with

FIGURE 10.14 (a) The dispersion relation of surface plasmon polaritons between Al and air, plotted as frequency $\omega/(2\pi c)$ (cm⁻¹) versus wavevector component $k_x/2\pi$ (cm⁻¹), where $k_x$ is the real part solution of Eq. (10.45); the light lines $k_x = \omega/c$ and $k_x = \varepsilon_d^{1/2}\omega/c$ are also shown. (b) Reflectance $R'_\lambda$ versus incidence angle $\theta$ (deg) in ATR arrangements at $\lambda = 500$ nm with $\varepsilon_d = 2.46$ and $\varepsilon_{\mathrm{Al}} = -39.5 + 1.5i$, either with Al ($d = 16$ nm or 24 nm) or air ($d = 900$ nm or 1200 nm) as the middle layer.

kx v/c represents the light line. On the left of this line, there exist propagating waves in air; whereas on the right of the light line, evanescent waves occur in air because kx v/c. The light line can be considered as a wave travelling in air along the x direction. On the polariton dispersion curve, kx increases quickly as v increases and reaches an asymptote at v vp/!2, when the real part of the dielectric function of Al approaches –1. Between vp/!2 v vp, the real part of the dielectric function of Al becomes negative with an absolute value less than 1. Therefore, the solution of Eq. (10.45) has a large imaginary part, while the real part of kx drops to near zero, as reflected by the

NEAR-FIELD ENERGY TRANSFER

399

bending of the dispersion curve toward left and the steep rise upward. Beyond v vp, metal becomes transparent and the real part of the dielectric function becomes positive. Solutions beyond v vp correspond to zeros in the reflection coefficient and thus are not the solutions for Eq. (10.42), which are poles of the reflection coefficient. Notice that the dotted line refers to the light line of the prism. In the shaded region, there exist evanescent waves in air but propagating waves appear in the prism; as a result, surface plasmons can be coupled to propagating waves in the prism. The reflectance is calculated from Eq. (10.46) at the wavelength l 500 nm, corresponding to a wavenumber of 20,000 cm1. As can be seen from Fig. 10.14a, at this frequency, the surface polariton curve is very close to the light line in air. Therefore, the excitation of surface polariton is expected to be near the critical angle uc < 39.6 between the prism and air. The reflectance would be close to 1 at u uc. However, as shown in Fig. 10.14b, the reflectance drops suddenly around 40 due to the excitation of surface polaritons. Furthermore, the reflectance dips are very sensitive to the thickness of the middle layer. In the Otto configuration, the air thickness of 900 nm yields a sharp dip. For the Kretschmann-Raether configuration, on the other hand, a metallic film thickness of 24 nm yields a sharp dip in the reflectance. If the Al film exceeds 50 nm, the reflectance is close to 1. The locations of the reflectance minimum and the width depend on the thickness of the middle layer. When the surface plasmon polariton is excited, a large absorption occurs in the metal, which results in a coupling of the electromagnetic energy to a surface wave. The propagation length of the surface wave can be determined based on the imaginary part of kx, i.e., ksx . 
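The reflectance calculation described in this example can be sketched in a few lines of Python. The sketch below evaluates Eq. (10.46) for a TM wave; it assumes nonmagnetic media and the TM Fresnel coefficient $r_{ij} = (k_{iz}/\varepsilon_i - k_{jz}/\varepsilon_j)/(k_{iz}/\varepsilon_i + k_{jz}/\varepsilon_j)$, which is not written out in this section. The function names, the angle grid, and the rounded value of $c$ are illustrative choices rather than anything taken from the text.

```python
import numpy as np

C0 = 2.998e8  # speed of light in vacuum, m/s (rounded; an assumption of this sketch)

def drude_al(omega, omega_p=2.4e16, gamma=1.4e14):
    """Drude dielectric function of Al with the parameters quoted in Example 10-6."""
    return 1.0 - omega_p**2 / (omega**2 + 1j * omega * gamma)

def kz(eps, kx, k0):
    """Normal wavevector component, taking the branch with Im(kz) >= 0."""
    val = np.sqrt(eps * k0**2 - kx**2 + 0j)
    return np.where(val.imag < 0, -val, val)

def r_tm(eps_i, eps_j, kiz, kjz):
    """Fresnel reflection coefficient for a TM wave (nonmagnetic media assumed)."""
    a, b = kiz / eps_i, kjz / eps_j
    return (a - b) / (a + b)

def atr_reflectance(theta_deg, wavelength, eps0, eps1, eps2, d):
    """Reflectance of prism (0) / middle layer (1) / medium (2) from Eq. (10.46)."""
    k0 = 2.0 * np.pi / wavelength
    kx = np.sqrt(eps0) * k0 * np.sin(np.radians(theta_deg))
    k0z, k1z, k2z = (kz(e, kx, k0) for e in (eps0, eps1, eps2))
    r01 = r_tm(eps0, eps1, k0z, k1z)
    r12 = r_tm(eps1, eps2, k1z, k2z)
    phase = np.exp(2j * k1z * d)  # exp(2 i beta) with beta = k1z d
    r = (r01 + r12 * phase) / (1.0 + r01 * r12 * phase)
    return np.abs(r) ** 2

wavelength = 500e-9
eps_al = drude_al(2.0 * np.pi * C0 / wavelength)
theta = np.linspace(38.0, 43.0, 5001)

# Kretschmann-Raether: prism / Al film (24 nm) / air
R_kr = atr_reflectance(theta, wavelength, 2.46, eps_al, 1.0, 24e-9)
# Otto: prism / air gap (900 nm) / Al
R_otto = atr_reflectance(theta, wavelength, 2.46, 1.0, eps_al, 900e-9)

# Report the reflectance minimum beyond the prism-air critical angle
theta_c = np.degrees(np.arcsin(1.0 / np.sqrt(2.46)))
above = theta > theta_c
print(f"eps_Al = {eps_al:.2f}, critical angle = {theta_c:.1f} deg")
for name, R in (("Kretschmann-Raether", R_kr), ("Otto", R_otto)):
    i = np.argmin(R[above])
    print(f"{name}: minimum R = {R[above][i]:.3f} at {theta[above][i]:.2f} deg")
```

The fine angle grid matters here because the dip is only a fraction of a degree wide; a coarse grid can step over it. Changing `d` in either call shows how sensitive the depth and width of the dip are to the thickness of the middle layer.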
Note that the field can be expressed as $e^{ik_x'x - k_x''x}$ for surface waves propagating in the positive $x$ direction and as $e^{-ik_x'x + k_x''x}$ for surface waves propagating in the negative $x$ direction. The power is proportional to the square of the field amplitude, and the ($1/e$) power decay length or the polariton propagation length is³⁰

$$l_{\rm sp} = \frac{1}{2k_x''} \tag{10.47}$$

Substituting the values into Eq. (10.45), we obtain $l_{\rm sp} \approx 80\ \mu$m. Note that the Drude model somewhat underpredicts the imaginary part of the dielectric function. If Im($\varepsilon$) of Al were taken as 10 at $\lambda = 500$ nm, one would obtain $l_{\rm sp} \approx 13\ \mu$m, still much longer than the wavelength.
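The two propagation lengths quoted above can be checked numerically. The following is a minimal sketch that assumes Eq. (10.45) is the usual single-interface dispersion relation for nonmagnetic media, $k_x = (\omega/c)\sqrt{\varepsilon_1\varepsilon_2/(\varepsilon_1 + \varepsilon_2)}$; that equation is not reproduced in this section, and the function names are hypothetical.

```python
import numpy as np

C0 = 2.998e8  # speed of light in vacuum, m/s (rounded; an assumption of this sketch)

def drude(omega, omega_p, gamma):
    """Drude dielectric function, 1 - wp^2 / (w^2 + i w gamma)."""
    return 1.0 - omega_p**2 / (omega**2 + 1j * omega * gamma)

def spp_kx(omega, eps1, eps2):
    """Complex kx of a surface polariton at the interface between media 1 and 2.

    Uses the principal square root, which gives Re(kx) > 0 and Im(kx) > 0
    when Re(eps2) < -eps1 and Im(eps2) > 0 (the case considered here).
    """
    return (omega / C0) * np.sqrt(eps1 * eps2 / (eps1 + eps2) + 0j)

def propagation_length(kx):
    """1/e power decay length from Eq. (10.47)."""
    return 1.0 / (2.0 * np.imag(kx))

omega = 2.0 * np.pi * C0 / 500e-9
eps_al = drude(omega, omega_p=2.4e16, gamma=1.4e14)

l_drude = propagation_length(spp_kx(omega, 1.0, eps_al))
# Same real part, but with Im(eps) = 10 as suggested in the text
l_lossy = propagation_length(spp_kx(omega, 1.0, complex(eps_al.real, 10.0)))

print(f"Drude model:  l_sp = {l_drude * 1e6:.1f} um")
print(f"Im(eps) = 10: l_sp = {l_lossy * 1e6:.1f} um")
```

Sweeping `omega` in `spp_kx` and plotting the real part against frequency gives the solid curve of Fig. 10.14a, provided the branch of the square root is handled with care above $\omega_p/\sqrt{2}$.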

Another way to excite surface plasmon or phonon polaritons is by gratings. When light is incident onto a grating at a given $k_x$, the Bloch-Floquet condition given by Eq. (9.63a) in Sec. 9.4 states that the reflected and refracted waves can have different values of the parallel component of the wavevector: $k_{x,j} = k_x + 2\pi j/\Lambda$, where $j$ is the diffraction order and $\Lambda$ is the period of the gratings. For this reason, the dispersion relation can be folded into the region for $k_x \le \pi/\Lambda$ and surface polaritons can be excited on a grating surface. As an example, Fig. 10.15a shows the reduced dispersion relation for a binary grating made of Ag with $\Lambda = 1.7\ \mu$m. The solid lines are the folded dispersion curves, and the dash-dotted lines, which are also folded, correspond to an incidence angle of 30°. The intersections identify the locations where surface plasmons can be excited for a TM wave incidence, when the magnetic field is parallel to the grooves. The reflectance of a shallow grating on Ag is calculated and plotted in Fig. 10.15b at $\theta = 0°$ and 30°. The grating height is $d = 100$ nm, and the filling ratio is $f = 0.65$ (see Fig. 9.18 for the grating geometry). For a TE wave, no dips exist in the reflectance because surface waves cannot be excited. The reflectance is very high for TE waves and differs little between $\theta = 0°$ and $\theta = 30°$. For a TM wave, the excitation of surface polaritons is responsible for the dips in the reflectance. Furthermore, the frequency locations agree well with those predicted by the dispersion curves. Note that at normal incidence, the excitation frequencies are located at the intersections between the dispersion curve and the vertical axis, as shown in Fig. 10.15a. These dips have also been known as Wood's or the Rayleigh-Wood anomalies, when a diffraction order just appears at the grazing angle; see Hessel and Oliner (Appl. Opt., 4, 1275, 1965). The actual resonance frequency may shift slightly from the frequency associated with the appearance or disappearance of a diffraction order, because the dispersion curve is not a straight line. The Rayleigh-Wood anomaly may also occur for gratings whose dielectric functions have a positive real part, i.e., not associated with surface plasmon polaritons.

[Figure 10.15. Panel (a): wavenumber (cm$^{-1}$) versus $k_x/(2\pi)$ (cm$^{-1}$), showing the dispersion line, the folded dispersion lines, and the lines for $\theta = 30°$. Panel (b): reflectance $R'_\lambda$ versus wavenumber (cm$^{-1}$) for TM waves at $\theta = 0°$ and $\theta = 30°$, for an Ag grating with $\Lambda = 1.7\ \mu$m, $d = 0.1\ \mu$m, and filling ratio $f = 0.65$.]

FIGURE 10.15 (a) Dispersion curves for gratings. (b) Reflectance for an Ag grating.

It should be mentioned that many polar dielectric or semiconductor materials such as MgO, SiC, and GaAs contain a phonon absorption band, called the reststrahlen band, where Re($\varepsilon$) is negative and Im($\varepsilon$) is very small. The surface polariton condition described in Eq. (10.42) can be satisfied in the infrared, and the associated excitation or resonance is called a surface phonon polariton.
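For the grating discussed above, the wavenumbers at which a diffraction order emerges at the grazing angle follow directly from the Bloch-Floquet condition: with $k_x = (\omega/c)\sin\theta$, order $j$ is at grazing when $|k_x + 2\pi j/\Lambda| = \omega/c$. The sketch below evaluates these Rayleigh-Wood positions for incidence from vacuum; the function name and the choice of orders are hypothetical. As the text notes, the actual surface plasmon resonances are expected to lie slightly away from these values, so the numbers are only rough guides to where the dips in Fig. 10.15b appear.

```python
import math

def rayleigh_wavenumbers(period_cm, theta_deg, max_order=3):
    """Wavenumbers (cm^-1) at which diffraction order +j or -j is at grazing.

    From sin(theta_j) = sin(theta) + j / (nu * period), with sin(theta_j) = +1
    for positive orders and -1 for negative orders. Valid for 0 <= theta < 90 deg.
    """
    s = math.sin(math.radians(theta_deg))
    result = {}
    for j in range(1, max_order + 1):
        result[+j] = j / (period_cm * (1.0 - s))  # order +j grazing along +x
        result[-j] = j / (period_cm * (1.0 + s))  # order -j grazing along -x
    return result

period_cm = 1.7e-4  # grating period of 1.7 micrometers, expressed in cm
for theta in (0.0, 30.0):
    values = rayleigh_wavenumbers(period_cm, theta)
    listing = ", ".join(f"j={j:+d}: {nu:.0f}" for j, nu in sorted(values.items()))
    print(f"theta = {theta:.0f} deg -> {listing} (cm^-1)")
```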
In the following discussion of polaritons, the word “metal” is used to signify a material with a negative real permittivity or a negative-$\varepsilon$ material. Surface roughness is yet another way to excite surface waves because a rough surface can be considered as a Fourier expansion of multiple periodic components, each acting as a grating. There is ample room to tune the radiative properties by surface polaritons with different geometries. The resonance behavior in nanoparticles or quantum dots has enormous applications in chemical sensing and medical diagnoses. Plasmon waveguides, which are based on the resonance of nanoparticles, nanowires, and nanotips, may allow electromagnetic energy transfer beyond the diffraction limit; see, for example, Maier et al. (Nature Mater., 2, 229, 2003), Dickson and Lyon (J. Phys. Chem. B, 104, 6095, 2000), and Stockman (Phys. Rev. Lett., 93, 137404, 2004). Mie in 1908 developed the formula to describe scattering from small absorbing particles, and expressed the scattering coefficient and the absorption coefficient in the limit of a small sphere, whose radius $r_0$ is much smaller than the wavelength in vacuum $\lambda$, as

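$$Q_{{\rm sca},\lambda} = \frac{8}{3}\left(\frac{2\pi r_0}{\lambda}\right)^4 \varepsilon_1^2 \left|\frac{\varepsilon_2 - \varepsilon_1}{\varepsilon_2 + 2\varepsilon_1}\right|^2 \tag{10.48}$$

and

$$Q_{{\rm abs},\lambda} = \frac{8\pi r_0}{\lambda}\sqrt{\varepsilon_1}\,{\rm Im}\left(\frac{\varepsilon_2 - \varepsilon_1}{\varepsilon_2 + 2\varepsilon_1}\right) \tag{10.49}$$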

where $\varepsilon_1$ is the dielectric function of the surrounding dielectric medium and $\varepsilon_2$ is that of the absorbing sphere.³² While Eq. (10.48) has the same form as the expression of Rayleigh scattering with the $1/\lambda^4$ relationship of the scattering cross section, defined as $4\pi r_0^2 Q_{{\rm sca},\lambda}$, the scattering of metallic spheres is distinctly different from that of dielectric spheres because the dielectric function of metals is complex and depends strongly on the wavelength. The scattering cross section is usually a very complex function of the wavelength. This is especially true when the resonance condition $\varepsilon_2 = -2\varepsilon_1$ is satisfied. This resonance is associated with the localized surface plasmon polaritons. Geometric optics fails completely to describe scattering and absorption by small particles. The scattering cross section can be much greater than the actual surface area. Furthermore, the absorbed energy can exceed that of a blackbody of the same size. In fact, the blackbody concept is misleading in the subwavelength regime. The actual resonance condition may be complicated for different geometries and coatings, as well as for clusters of particles or nanoparticle aggregates. A detailed discussion of resonance in metallic and polar dielectric materials in the absorption band can be found in Bohren and Huffman;³² also see Yang et al. (J. Chem. Phys., 102, 869, 1995), Link et al. (J. Phys. Chem. B, 103, 3073, 1999), Jin et al. (Science, 294, 1901, 2001), and Kottmann et al. (Phys. Rev. B, 64, 235402, 2001). Resonance phenomena in small particles have been applied to surface-enhanced Raman scattering microscopy and surface-enhanced fluorescence microscopy for single-molecule detection. The study of resonance phenomena in small particles continues to be an active research area because of the applications in biological imaging and molecular sensing; for details, refer to Moskovits (Rev. Mod. Phys., 57, 783, 1985), Chen et al. (Nano Lett., 5, 473, 2005), Johansson et al. (Phys. Rev. B, 72, 035427, 2005), and Pustovit and Shahbazyan (Phys. Rev. B, 73, 085408, 2006). Surface wave scattering has been used as a technique to characterize metallic nanoparticles.³³
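The small-sphere expressions in Eqs. (10.48) and (10.49) are simple enough to evaluate directly. The sketch below is illustrative only: it assumes a real, positive $\varepsilon_1$ for the surrounding medium, and the sphere radius, wavelength, and the sweep of Re($\varepsilon_2$) with a fixed Im($\varepsilon_2$) = 0.1 are made-up values chosen to expose the resonance at $\varepsilon_2 = -2\varepsilon_1$.

```python
import numpy as np

def q_sca(r0, wavelength, eps1, eps2):
    """Scattering efficiency of a small sphere, Eq. (10.48); eps1 real and positive."""
    ratio = (eps2 - eps1) / (eps2 + 2.0 * eps1)
    return (8.0 / 3.0) * (2.0 * np.pi * r0 / wavelength) ** 4 * eps1**2 * np.abs(ratio) ** 2

def q_abs(r0, wavelength, eps1, eps2):
    """Absorption efficiency of a small sphere, Eq. (10.49); eps1 real and positive."""
    ratio = (eps2 - eps1) / (eps2 + 2.0 * eps1)
    return (8.0 * np.pi * r0 / wavelength) * np.sqrt(eps1) * np.imag(ratio)

r0, wavelength, eps1 = 10e-9, 500e-9, 1.0  # hypothetical sphere in vacuum

# Sweep the real part of eps2 at fixed, small loss to locate the resonance
re_eps2 = np.linspace(-4.0, 0.0, 401)
eps2 = re_eps2 + 0.1j
absorption = q_abs(r0, wavelength, eps1, eps2)
scattering = q_sca(r0, wavelength, eps1, eps2)

peak = re_eps2[np.argmax(absorption)]
print(f"Q_abs peaks at Re(eps2) = {peak:.2f}")
print(f"peak Q_abs = {absorption.max():.2f}, peak Q_sca = {scattering.max():.3f}")
```

An absorption efficiency above 1 at the peak means the sphere absorbs more than the energy geometrically incident on it, which is the point made in the text about cross sections exceeding the physical size.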

10.2.2 Coupled Surface Polaritons and Bulk Polaritons

Polaritons can exist on both surfaces of a thin film, resulting in a standing wave inside the film, as shown in Fig. 10.16.

[Figure 10.16. A slab of thickness $d$ (medium 2, with $\varepsilon_2$, $\mu_2$) between medium 1 ($\varepsilon_1$, $\mu_1$) and medium 3 ($\varepsilon_3$, $\mu_3$); the field profiles are labeled 1, 2, and 3.]

FIGURE 10.16 Illustration of polaritons in a slab. 1: symmetric mode coupled surface polaritons; 2: antisymmetric mode coupled surface polaritons; and 3: bulk polariton.

Economou performed a detailed investigation of different configurations of a thin-film structure,³⁴ while Kovacs and Scott (Phys. Rev. B, 16, 1297, 1977) studied the optical excitation of surface plasma waves in layered structures. An essential requirement for coupled surface polaritons to occur is the existence of evanescent waves that decay in both media 1 and 3. Such a method was used in Sec. 10.1.2 for obtaining the mode equation for waveguides. A more convenient method to derive the polariton relations is to set the denominator of the reflection coefficient to zero. From Eq. (10.46), we can see that for the configuration shown in Fig. 10.16, $r = (r_{12} + r_{23}e^{2ik_{2z}d})/(1 + r_{12}r_{23}e^{2ik_{2z}d})$, which has poles at $1 + r_{12}r_{23}e^{2ik_{2z}d} = 0$. This can be expressed as follows:

$$\tanh(ik_{2z}d)\left(\frac{k_{2z}^2}{\varepsilon_2^2} + \frac{k_{1z}k_{3z}}{\varepsilon_1\varepsilon_3}\right) = \frac{k_{2z}}{\varepsilon_2}\left(\frac{k_{1z}}{\varepsilon_1} + \frac{k_{3z}}{\varepsilon_3}\right) \tag{10.50}$$

which is the polariton dispersion relation for a slab sandwiched between two semi-infinite media. Because $\tanh(ik_{2z}d) = i\tan(k_{2z}d)$, Eq. (10.50) is identical to the mode equation of a planar waveguide given in Eq. (10.16), when medium 3 i
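The step from the pole condition $1 + r_{12}r_{23}e^{2ik_{2z}d} = 0$ to Eq. (10.50) can be checked numerically. The sketch below assumes TM Fresnel coefficients for nonmagnetic media, $r_{ij} = (a_i - a_j)/(a_i + a_j)$ with $a_i = k_{iz}/\varepsilon_i$, which are not written out in this section. Under that assumption the two expressions are related by $(a+b)(b+c)\,e^{-i\phi}\,[1 + r_{12}r_{23}e^{2i\phi}] = -2\cos\phi\,[\tanh(i\phi)(b^2 + ac) - b(a+c)]$ with $\phi = k_{2z}d$, so they vanish together. The test values are arbitrary complex numbers, not physical parameters.

```python
import cmath

def pole_and_dispersion(a, b, c, phi):
    """Return the denominator of r and the residual of Eq. (10.50).

    a = k1z/eps1, b = k2z/eps2, c = k3z/eps3, phi = k2z * d (all may be complex).
    """
    r12 = (a - b) / (a + b)
    r23 = (b - c) / (b + c)
    pole = 1.0 + r12 * r23 * cmath.exp(2j * phi)
    # Left side minus right side of Eq. (10.50)
    residual = cmath.tanh(1j * phi) * (b**2 + a * c) - b * (a + c)
    return pole, residual

# Arbitrary complex test values (not tied to any material)
a, b, c, phi = 0.3 + 0.1j, -0.2 + 0.5j, 0.7 - 0.4j, 0.4 + 0.9j
pole, residual = pole_and_dispersion(a, b, c, phi)

lhs = (a + b) * (b + c) * cmath.exp(-1j * phi) * pole
rhs = -2.0 * cmath.cos(phi) * residual
print(f"|lhs - rhs| = {abs(lhs - rhs):.2e}")
```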

DOI: 10.1036/007143674X

To my wife Lingyun


CONTENTS

Preface
List of Symbols

Chapter 1. Introduction
  1.1 Limitations of the Macroscopic Formulation
  1.2 The Length Scales
  1.3 From Ancient Philosophy to Contemporary Technologies
    1.3.1 Microelectronics and Information Technology
    1.3.2 Lasers, Optoelectronics, and Nanophotonics
    1.3.3 Microfabrication and Nanofabrication
    1.3.4 Probing and Manipulation of Small Structures
    1.3.5 Energy Conversion Devices
    1.3.6 Biomolecule Imaging and Molecular Electronics
  1.4 Objectives and Organization of This Book
  References

Chapter 2. Overview of Macroscopic Thermal Sciences
  2.1 Fundamentals of Thermodynamics
    2.1.1 The First Law of Thermodynamics
    2.1.2 Thermodynamic Equilibrium and the Second Law
    2.1.3 The Third Law of Thermodynamics
  2.2 Thermodynamic Functions and Properties
    2.2.1 Thermodynamic Relations
    2.2.2 The Gibbs Phase Rule
    2.2.3 Specific Heats
  2.3 Ideal Gas and Ideal Incompressible Models
    2.3.1 The Ideal Gas
    2.3.2 Incompressible Solids and Liquids
  2.4 Heat Transfer Basics
    2.4.1 Conduction
    2.4.2 Convection
    2.4.3 Radiation
  2.5 Summary
  References
  Problems

Chapter 3. Elements of Statistical Thermodynamics and Quantum Theory
  3.1 Statistical Mechanics of Independent Particles
    3.1.1 Macrostates versus Microstates
    3.1.2 Phase Space
    3.1.3 Quantum Mechanics Considerations
    3.1.4 Equilibrium Distributions for Different Statistics
  3.2 Thermodynamic Relations
    3.2.1 Heat and Work
    3.2.2 Entropy
    3.2.3 The Lagrangian Multipliers
    3.2.4 Entropy at Absolute Zero Temperature
    3.2.5 Macroscopic Properties in Terms of the Partition Function
  3.3 Ideal Molecular Gases
    3.3.1 Monatomic Ideal Gases
    3.3.2 Maxwell’s Velocity Distribution
    3.3.3 Diatomic and Polyatomic Ideal Gases
  3.4 Statistical Ensembles and Fluctuations
  3.5 Basic Quantum Mechanics
    3.5.1 The Schrödinger Equation
    3.5.2 A Particle in a Potential Well or a Box
    3.5.3 A Rigid Rotor
    3.5.4 Atomic Emission and the Bohr Radius
    3.5.5 A Harmonic Oscillator
  3.6 Emission and Absorption of Photons by Molecules or Atoms
  3.7 Energy, Mass, and Momentum in Terms of Relativity
  3.8 Summary
  References
  Problems

Chapter 4. Kinetic Theory and Micro/Nanofluidics
  4.1 Kinetic Description of Dilute Gases
    4.1.1 Local Average and Flux
    4.1.2 The Mean Free Path
  4.2 Transport Equations and Properties of Ideal Gases
    4.2.1 Shear Force and Viscosity
    4.2.2 Heat Diffusion
    4.2.3 Mass Diffusion
    4.2.4 Intermolecular Forces
  4.3 The Boltzmann Transport Equation
    4.3.1 Hydrodynamic Equations
    4.3.2 Fourier’s Law and Thermal Conductivity
  4.4 Micro/Nanofluidics and Heat Transfer
    4.4.1 The Knudsen Number and Flow Regimes
    4.4.2 Velocity Slip and Temperature Jump
    4.4.3 Gas Conduction: From the Continuum to the Free Molecule Regime
  4.5 Summary
  References
  Problems

Chapter 5. Thermal Properties of Solids and the Size Effect
  5.1 Specific Heat of Solids
    5.1.1 Lattice Vibration in Solids: The Phonon Gas
    5.1.2 The Debye Specific Heat Model
    5.1.3 Free Electron Gas in Metals
  5.2 Quantum Size Effect on the Specific Heat
    5.2.1 Periodic Boundary Conditions
    5.2.2 General Expressions of Lattice Specific Heat
    5.2.3 Dimensionality
    5.2.4 Thin Films Including Quantum Wells
    5.2.5 Nanocrystals and Carbon Nanotubes
  5.3 Electrical and Thermal Conductivities of Solids
    5.3.1 Electrical Conductivity
    5.3.2 Thermal Conductivity of Metals
    5.3.3 Derivation of Conductivities from the BTE
    5.3.4 Thermal Conductivity of Insulators
  5.4 Thermoelectricity
    5.4.1 The Seebeck Effect and Thermoelectric Power
    5.4.2 The Peltier Effect and the Thomson Effect
    5.4.3 Thermoelectric Generation and Refrigeration
    5.4.4 Onsager’s Theorem and Irreversible Thermodynamics
  5.5 Classical Size Effect on Conductivities and Quantum Conductance
    5.5.1 Classical Size Effect Based on Geometric Consideration
    5.5.2 Classical Size Effect Based on the BTE
    5.5.3 Quantum Conductance
  5.6 Summary
  References
  Problems

Chapter 6. Electron and Phonon Transport
  6.1 The Hall Effect
  6.2 General Classifications of Solids
    6.2.1 Electrons in Atoms
    6.2.2 Insulators, Conductors, and Semiconductors
    6.2.3 Atomic Binding in Solids
  6.3 Crystal Structures
    6.3.1 The Bravais Lattices
    6.3.2 Primitive Vectors and the Primitive Unit Cell
    6.3.3 Basis Made of Two or More Atoms
  6.4 Electronic Band Structures
    6.4.1 Reciprocal Lattices and the First Brillouin Zone
    6.4.2 Bloch’s Theorem
    6.4.3 Band Structures of Metals and Semiconductors
  6.5 Phonon Dispersion and Scattering
    6.5.1 The 1-D Diatomic Chain
    6.5.2 Dispersion Relations for Real Crystals
    6.5.3 Phonon Scattering
  6.6 Electron Emission and Tunneling
    6.6.1 Photoelectric Effect
    6.6.2 Thermionic Emission
    6.6.3 Field Emission and Electron Tunneling
  6.7 Electrical Transport in Semiconductor Devices
    6.7.1 Number Density, Mobility, and the Hall Effect
    6.7.2 Generation and Recombination
    6.7.3 The p-n Junction
    6.7.4 Optoelectronic Applications
  6.8 Summary
  References
  Problems

Chapter 7. Nonequilibrium Energy Transfer in Nanostructures
  7.1 Phenomenological Theories
    7.1.1 Hyperbolic Heat Equation
    7.1.2 Dual-Phase-Lag Model
    7.1.3 Two-Temperature Model
  7.2 Heat Conduction Across Layered Structures
    7.2.1 Equation of Phonon Radiative Transfer (EPRT)
    7.2.2 Solution of the EPRT
    7.2.3 Thermal Boundary Resistance (TBR)
  7.3 Heat Conduction Regimes
  7.4 Summary
  References
  Problems

Chapter 8. Fundamentals of Thermal Radiation
  8.1 Electromagnetic Waves
    8.1.1 Maxwell’s Equations
    8.1.2 The Wave Equation
    8.1.3 Polarization
    8.1.4 Energy Flux and Density
    8.1.5 Dielectric Function
    8.1.6 Propagating and Evanescent Waves
  8.2 Blackbody Radiation: The Photon Gas
    8.2.1 Planck’s Law
    8.2.2 Radiation Thermometry
    8.2.3 Entropy and Radiation Pressure
    8.2.4 Limitations of Planck’s Law
  8.3 Radiative Properties of Semi-Infinite Media
    8.3.1 Reflection and Refraction of a Plane Wave
    8.3.2 Emissivity
    8.3.3 Bidirectional Reflectance
  8.4 Dielectric Function Models
    8.4.1 Kramers-Kronig Dispersion Relations
    8.4.2 The Drude Model for Free Carriers
    8.4.3 The Lorentz Oscillator Model for Lattice Absorption
    8.4.4 Semiconductors
    8.4.5 Superconductors
    8.4.6 Metamaterials with a Magnetic Response
  8.5 Summary
  References
  Problems

Chapter 9. Radiative Properties of Nanomaterials
  9.1 Radiative Properties of a Single Layer
    9.1.1 The Ray Tracing Method for a Thick Layer
    9.1.2 Thin Films
    9.1.3 Partial Coherence
    9.1.4 Effect of Surface Scattering
  9.2 Radiative Properties of Multilayer Structures
    9.2.1 Thin Films with Two or Three Layers
    9.2.2 The Matrix Formulation
    9.2.3 Radiative Properties of Thin Films on a Thick Substrate
    9.2.4 Local Energy Density and Absorption Distribution
  9.3 Photonic Crystals
  9.4 Periodic Gratings
    9.4.1 Rigorous Coupled-Wave Analysis (RCWA)
    9.4.2 Effective Medium Formulations
  9.5 Bidirectional Reflectance Distribution Function (BRDF)
    9.5.1 The Analytical Model
    9.5.2 The Monte Carlo Method
    9.5.3 Surface Characterization
    9.5.4 BRDF Measurements
    9.5.5 Comparison of Modeling with Measurements
  9.6 Summary
  References
  Problems

Chapter 10. Near-Field Energy Transfer
  10.1 Total Internal Reflection, Guided Waves, and Photon Tunneling
    10.1.1 The Goos-Hänchen Shift
    10.1.2 Waveguides and Optical Fibers
    10.1.3 Photon Tunneling by Coupled Evanescent Waves
    10.1.4 Thermal Energy Transfer between Closely Spaced Dielectrics
    10.1.5 Resonance Tunneling through Periodic Dielectric Layers
    10.1.6 Photon Tunneling with Negative Index Materials
  10.2 Polaritons or Electromagnetic Surface Waves
    10.2.1 Surface Plasmon and Phonon Polaritons
    10.2.2 Coupled Surface Polaritons and Bulk Polaritons
    10.2.3 Polariton-Enhanced Transmission of Layered Structures
    10.2.4 Radiation Transmission through Nanostructures
    10.2.5 Superlens for Perfect Imaging and the Energy Streamlines
  10.3 Spectral and Directional Control of Thermal Radiation
    10.3.1 Gratings and Microcavities
    10.3.2 Metamaterials
    10.3.3 Modified Photonic Crystals for Coherent Thermal Emission
  10.4 Radiation Heat Transfer at Nanometer Distances
    10.4.1 The Fluctuational Electrodynamics
    10.4.2 Heat Transfer between Parallel Plates
    10.4.3 Asymptotic Formulation
    10.4.4 Nanoscale Radiation Heat Transfer between Doped Silicon
  10.5 Summary
  References
  Problems

Appendix A. Physical Constants, Conversion Factors, and SI Prefixes
  Physical Constants
  Conversion Factors
  SI Prefixes

Appendix B. Mathematical Background
  B.1 Some Useful Formulae
    B.1.1 Series and Integrals
    B.1.2 The Error Function
    B.1.3 Stirling’s Formula
  B.2 The Method of Lagrange Multipliers
  B.3 Permutation and Combination
  B.4 Events and Probabilities
  B.5 Distribution Functions and the Probability Density Function
  B.6 Complex Variables
  B.7 The Plane Wave Solution
  B.8 The Sommerfeld Expansion

Index


PREFACE

Over the past 20 years, there have been tremendous developments in microelectronics, microfabrication technology, MEMS and NEMS, quantum structures (e.g., superlattices, nanowires, nanotubes, and nanoparticles), optoelectronics and lasers, including ultrafast lasers, and molecular- to atomic-level imaging techniques (such as high-resolution electron microscopy, scanning tunneling microscopy, atomic force microscopy, near-field optical microscopy, and scanning thermal microscopy). The field is fast moving into scaling up and systems engineering to explore the unlimited potential that nanoscience and nanoengineering may offer to restructure the technologies in the new millennium. When the characteristic length becomes comparable to the mechanistic length scale, continuum assumptions that are often made in conventional thermal analyses may break down. Similarly, when the characteristic time becomes comparable to the mechanistic timescale, traditional equilibrium approaches may not be appropriate. Understanding the energy transport mechanisms in small dimensions and short timescales is crucial for future advances of nanotechnology. In recent years, a growing number of research publications have been in nano/microscale thermophysical engineering. Timely dissemination of the knowledge gained from contemporary research to educate future scientists and engineers is of emerging significance. For this reason, more and more universities have started to offer courses in microscale areas. A self-contained textbook suitable for engineering students is much needed. Many practicing engineers who have graduated earlier wish to learn what is going on in this fascinating area, but are often frustrated due to the lack of a solid background to comprehend the contemporary literature. A book that does not require prior knowledge in statistical mechanics, quantum mechanics, solid state physics, and electrodynamics is extremely helpful.
On the other hand, such a book should cover all these subjects in some depth without significant prerequisites. This book is written for engineering senior undergraduate and graduate students, practicing engineers, and academic researchers who have not been extensively exposed to nanoscale sciences but wish to gain a solid background in the thermal phenomena occurring at small length scales and short timescales. The basic philosophy behind this book is to logically integrate the traditional knowledge in thermal engineering and physics with newly developed theories in an easy-to-understand approach, with ample examples and homework problems. The materials have been used in the graduate course and undergraduate elective that I have taught a number of times at two universities since 1999. While this book can be used as a text for a senior elective or an entry-level graduate course, it is not expected that all the materials will be covered in a one-semester course. The instructors should have the freedom to select materials from the book according to students’ backgrounds and interests. Some chapters and sections can also be used to integrate with traditional thermal science courses in order to update the current undergraduate and graduate curricula with nanotechnology contents.

The content of this book includes microscopic descriptions and approaches, as well as their applications in thermal science and engineering, with an emphasis on energy transport in gases and solids by conduction (diffusion) and radiation (with or without a medium), as well as convection in micro/nanofluidics. Following the introduction in Chapter 1, an in-depth overview on the foundation of macroscopic thermodynamics, heat transfer, and fluid mechanics is given in Chapter 2. Chapter 3 summarizes the well-established theories in statistical mechanics, including classical and quantum statistics; thermal properties of ideal gases are described in the context of statistical thermodynamics, followed by a concise presentation of quantum mechanics. Chapter 4 focuses on microfluidics and introduces the Boltzmann transport equation. The heat transfer and microflow regimes from continuous flow to free molecule flow are described. In Chapters 5, 6, and 7, heat transfer in solid nanostructures is extensively discussed. Chapter 5 presents the classical and quantum size effects on specific heat and thermal conductivity without involving detailed solid state theories, which are introduced in Chapter 6. This arrangement allows a more intuitive learning experience. Chapter 7 focuses on transient as well as nonequilibrium energy transport processes in nanostructures. The next three chapters deal with thermal radiation at nanoscales. Chapter 8 provides the fundamental understanding of electromagnetic waves and the dielectric properties of various materials. The concept of radiation entropy is also introduced, along with the recently demonstrated metamaterials with exotic properties. Chapter 9 describes interference effects of thin films and multilayers, the band structure of photonic crystals, diffraction from surface-relief gratings, and scattering from rough surfaces. Chapter 10 explores the evanescent wave and the coupling phenomena in the near field for energy transfer. Recent advances in nanophotonics and nanoscale radiation heat transfer are also summarized. The dual nature of particles and waves is emphasized throughout the book in explaining the energy carriers, such as molecules in ideal gases, electrons in metals, phonons in dielectric crystalline materials, and photons for radiative transfer.
In the early 1990s, I was fortunate to work with Professor Markus Flik for my Ph.D. dissertation on the infrared spectroscopy of thin (down to 10 nanometers) high-temperature superconducting films for microfabricated, highly sensitive radiation detectors, as well as to assist him in the summer short course on microscale heat transfer at the Massachusetts Institute of Technology. While I was still a postdoctoral researcher, the late Professor Chang-Lin Tien, then Chancellor of the University of California at Berkeley, wrote an invitation letter to me to give a seminar in the Department of Mechanical Engineering of Berkeley in January 1994; he continuously supported me, including the development of the concept of this book. The last time I heard from him was just a few weeks prior to the 2000 National Heat Transfer Conference in Pittsburgh, where he delivered a plenary speech before he fell ill. In his letter dated August 10, 2000, Professor Tien enthusiastically endorsed my plan to write a microscale textbook and encouraged me to include nano aspects. He wrote, “I would like to express to you my strongest support for your project; however, I would suggest that you broaden the content somewhat beginning with the title to ‘Micro/Nanoscale Heat ...,’ and to talk about some coverage on nano aspects.” Professor Tien opened my eyes, and it took me several years afterward to complete this book, which now has more emphasis on nanoscale thermal sciences and engineering. I also benefited greatly from the encouragement and comments received through discussions with a large number of people in the heat transfer and thermophysics community, too many to be listed here. I am grateful to my colleagues and friends at both University of Florida (UF) and Georgia Tech for their help whenever needed.
I especially want to thank Professor William Tiederman, who was Chair of the Department of Mechanical Engineering during my stay at UF, for his support and mentorship at the early stage of my independent research and teaching career. Professor David Tanner in the Department of Physics of UF helped me understand solid state physics; I have enjoyed collaboration with him since 1995. Through the years, Dr. Jack Hsia, former Chief of Academic Affairs at the National Institute of Standards and Technology (NIST), offered me much personal and professional advice. He is one of the many outstanding mentors I have had from NIST, where I gained my postdoctoral experience and worked for a number of summers afterward. This book would not have been possible without my graduate students’ hard work and dedication. Most of them have taken my classes and proofread different versions of the manuscript. Some materials in the last few chapters of the book were generated based on their thesis research. Many graduate and undergraduate students who have taken my classes or worked in the Nanoscale Thermal Radiation Lab also provided constructive suggestions. I enjoyed working with all of them. I must thank the Sponsoring Editor, Ken McCombs, for his endurance and persistence that kept me on the writing track over the past few years, and the whole production team, for carefully editing the manuscript and setting the final pages. While this project was partially supported by the National Science Foundation as part of my educational plan in the CAREER/PECASE grant, I take full responsibility for any inadvertent errors or mistakes. Finally, I thank my family for their understanding and support throughout the writing journey. My three children, Emmy, Angie, and Bryan, have given me great happiness and made my life meaningful. This book is dedicated to my wife Lingyun for the unconditional love and selfless care she has provided to me and to our children.

ZHUOMIN M. ZHANG

LIST OF SYMBOLS

$A$    area, m²; Helmholtz free energy, J
$A_c$    cross-sectional area, m²
$A_\lambda$    directional-spectral absorptance of a semitransparent material
$\mathbf{a}$    acceleration, m/s²
$a$    lattice constant, m; magnitude of acceleration, m/s²
$a_0$    Bohr radius, 0.0529 nm
$a_\lambda$    absorption coefficient, m⁻¹
$B$    magnetic induction or magnetic flux density, T (tesla) or Wb/m²
$C$    volumetric heat capacity ($\rho c_p$), J/(m³ K)
$c$    phase velocity of electromagnetic wave, m/s
$c_0$    speed of light in vacuum, 2.998 × 10⁸ m/s
$c_v$ or $\bar{c}_v$    mass or molar specific heat for constant volume, J/(kg K) or J/(kmol K)
$c_p$ or $\bar{c}_p$    mass or molar specific heat for constant pressure, J/(kg K) or J/(kmol K)
$\mathbf{D}$    dynamical matrix; electric displacement, C/m²
$D$    density of states, m⁻³; diameter, m
$D_{AB}$    binary diffusion coefficient, m²/s
$d$    diameter or film thickness, m
$\mathbf{E}$    electric field vector, N/C or V/m
$E$    energy, J; magnitude of electric field, V/m
$E_F$    Fermi energy, J
$E_g$    bandgap energy, J
$e$    electron charge (absolute value), 1.602 × 10⁻¹⁹ C
$e_b$    blackbody emissive power, W/m²
$\mathbf{F}$, $F$    force, N
$F$    normalized distribution function
$f$    distribution function (sometimes normalized)
$\mathbf{G}$    reciprocal lattice vector, m⁻¹; dyadic Green function
$G$    Gibbs free energy, J; electron-phonon coupling constant, W/(m³ K)
$\bar{g}$    molar specific Gibbs free energy, J/kmol
$g$    degeneracy
$\mathbf{H}$    magnetic field vector, A/m or C/(m s)
$H$    enthalpy, J; magnetic field strength, A/m or C/(m s)
$h$    mass specific enthalpy, J/kg; convection heat transfer coefficient, W/(m² K); Planck’s constant, 6.626 × 10⁻³⁴ J s
$h_m$    convection mass transfer coefficient, m/s
$\hbar$    Planck’s constant divided by 2π, $h/2\pi$
$\bar{h}$    molar specific enthalpy, J/kmol
$\mathbf{I}$    unit matrix; unit dyadic

$I$    moment of inertia, kg m²; intensity or radiance, W/(m² µm sr); electric current, A
$i$    $\sqrt{-1}$
$i, j, k$    indices used in series
$\mathbf{J}$ or $J$    flux vector or magnitude (quantity transferred per unit area per unit time)
$J$, $J_e$    current density (also called electric charge flux), A/m²
$J_E$    energy flux, W/m²
$J_m$    mass flux, kg/(s m²)
$J_N$    particle flux, m⁻² s⁻¹
$J_p$    momentum flux, Pa (N/m²)
$K$    spring constant, N/m; Thomson’s coefficient, V/K; Bloch wavevector, m⁻¹
$\mathbf{k}$    wavevector, m⁻¹
$k$    magnitude of the wavevector, m⁻¹
$k_B$    Boltzmann’s constant, 1.381 × 10⁻²³ J/K
$L$    characteristic length, m
$L_0$    average distance between molecules or atoms, m
$L_\lambda$    radiation entropy intensity, W/(K m² µm sr)
$l$    length, m
$l, m, n$    index numbers
$M$    molecular weight, kg/kmol
$m$    mass of a system or a single particle, kg
$m_r$    reduced mass, kg
$m^*$    effective mass, kg
$\dot{m}$    mass flow rate or mass transfer rate, kg/s
$N$    number of particles; number of phonon oscillators
$N_A$    Avogadro’s constant, 6.022 × 10²⁶ kmol⁻¹; acceptor concentration, m⁻³
$N_D$    donor concentration, m⁻³
$\dot{N}$    particle flow rate, s⁻¹
$n$    number density, m⁻³; quantum number; real part of refractive index or refractive index
$\bar{n}$    amount of substance, kmol
$\tilde{n}$    complex refractive index
$\mathbf{P}$    propagation matrix; polarization vector or dipole moment per unit volume, C/m²
$P$    pressure, Pa (N/m²)
$P_{ij}$    momentum flux component, Pa
$\mathbf{p}$    momentum vector ($m\mathbf{v}$ or $\hbar\mathbf{k}$), kg m/s
$p$    momentum ($mv$ or $\hbar k$), kg m/s; probability; specularity
$p, q$    index numbers
$Q$    heat, J; quality factor
$\dot{Q}$    heat transfer rate, W
$q$    number of coexisting phases; number of atoms per molecule
$\dot{q}$    thermal energy generation rate, W/m³
$\mathbf{q}''$    heat flux vector, W/m²
$q''$    heat flux, W/m²
$R$    gas constant, J/(kg K); electrical resistance, Ω or V/A
$R'$    directional-hemispherical reflectance
$R''_b$    thermal boundary resistance, m² K/W
$R''_t$    thermal resistance, m² K/W
$\bar{R}$    universal gas constant, 8314.5 J/(kmol K)

$r$    distance or radius, m; Fresnel reflection coefficient
$r_e$    electrical resistivity, Ω m
$\tilde{r}$    complex Fresnel reflection coefficient
$\mathbf{S}$    Poynting vector, W/m²
$S$    entropy, J/K
$S_j$    strength of the jth phonon oscillator
$\dot{S}$    entropy transfer rate, W/K
$\dot{S}_{gen}$    entropy generation rate, W/K
$s$ or $\bar{s}$    specific entropy, J/(kg K), J/(m³ K) or J/(kmol K)
$\dot{s}_{gen}$    volumetric entropy generation rate, W/(m³ K)
$s''$    entropy flux, J/(m² K)
$T$    temperature, K
$T'$    directional-hemispherical transmittance
$t$    time, s; Fresnel transmission coefficient
$\tilde{t}$    complex Fresnel transmission coefficient
$U$    internal energy, J; periodic potential, J
$u_d$    drift velocity, m/s
$u$ or $\bar{u}$    specific internal energy: mass specific, J/kg, and volume specific (i.e., energy density), J/m³, or molar specific, J/kmol
$V$    volume, m³; voltage, V
$\mathbf{v}$    velocity, m/s
$v_B$    bulk or mean velocity, m/s
$v_R$    random or thermal velocity, m/s
$v$    specific volume, m³/kg; speed, m/s
$v_a$    speed of sound or average speed of phonons, m/s
$v_F$    Fermi velocity, m/s
$v_g$    magnitude of group velocity ($d\omega/dk$), m/s
$v_l$, $v_t$    longitudinal, transverse phonon speed, m/s
$v_p$    phase speed ($\omega/k$), m/s
$v_x$, $v_y$, $v_z$    velocity components, m/s
$\bar{v}$    molar specific volume, m³/kmol; average speed, m/s
$W$    work, J; width, m
$x, y, z$    coordinates, m
$Z$    partition function

DIMENSIONLESS PARAMETERS

Kn    Knudsen number, $\Lambda/L$
Le    Lewis number, $D_{AB}/\alpha = \mathrm{Pr}/\mathrm{Sc}$
Lz    Lorentz number, $\kappa/(\sigma T)$
Ma    Mach number, $v/v_a$
Nu    Nusselt number, $hL/\kappa$
Pe    Peclet number, $\mathrm{Re}\,\mathrm{Pr} = vL/\alpha$
Pr    Prandtl number, $\nu/\alpha$
Re    Reynolds number, $\rho v L/\mu$
Sc    Schmidt number, $\nu/D_{AB}$
ZT    dimensionless figure of merit for thermoelectricity

GREEK SYMBOLS

$\alpha$    thermal diffusivity, m²/s; other constant
$\alpha$ and $\beta$    Lagrangian multipliers
$\alpha_T$    thermal accommodation coefficient
$\alpha_v$    (tangential) momentum accommodation coefficient
$\alpha_v'$    normal momentum accommodation coefficient
$\alpha'_\lambda$    directional-spectral absorptivity
$\beta$    phase shift, rad; various coefficients
$\beta_P$    isobaric thermal expansion coefficient, K⁻¹
$\beta_T$    $2\gamma(2 - \alpha_T)\mathrm{Kn}/[\alpha_T(\gamma + 1)\mathrm{Pr}]$
$\beta_v$    $(2 - \alpha_v)\mathrm{Kn}/\alpha_v$
$\Gamma_{ij}$    hemispherical transmissivity for phonons from medium i to j
$\Gamma_S$    Seebeck’s coefficient, V/K
$\gamma$    specific heat ratio ($c_p/c_v$); scattering rate ($1/\tau$), rad/s
$\gamma_s$    Sommerfeld constant, J/(kg K²)
$\delta$    differential small quantity; boundary layer thickness, m
$\delta_\lambda$    radiation penetration depth, m
$\varepsilon$    particle energy, J; electric permittivity, C²/(N m²); ratio of permittivity to that of vacuum; emissivity
$\tilde{\varepsilon}$    complex dielectric function, i.e., ratio of permittivity to that of vacuum
$\varepsilon'_\lambda$    directional-spectral emissivity
$\eta_H$    Hall coefficient $E_y/(J_x B)$, m³/C
$\Theta$    characteristic temperature, K
$\Theta_D$    Debye temperature, K
$\theta$    zenith angle, rad
$\theta_B$    Brewster’s angle, rad
$\theta_c$    critical angle, rad
$\kappa$    thermal conductivity, W/(m K); extinction coefficient (i.e., imaginary part of the refractive index)
$\kappa_T$    isothermal compressibility, Pa⁻¹
$\Lambda$    mean free path, m; period of a grating or photonic crystal, m
$\Lambda_a$    average distance between collisions, m
$\lambda$    wavelength, m
$\mu$    viscosity, N s/m²; chemical potential, J; electron or hole mobility, m²/(V s); magnetic permeability, N/A²; ratio of the permeability to that of vacuum
$\mu_F$    Fermi energy, J
$\nu$    kinematic viscosity, m²/s; frequency, Hz
$\bar{\nu}$    wavenumber, cm⁻¹
$\Pi$    Peltier’s coefficient, V
$\rho$    density, kg/m³
$\rho_e$    charge density, C/m³
$\rho'$    directional-hemispherical reflectivity
$\sigma$    electrical conductivity, (Ω m)⁻¹; standard deviation
$\sigma_{rms}$    root-mean-square surface roughness, m
$\sigma_{SB}$    Stefan-Boltzmann constant, 5.67 × 10⁻⁸ W/(m² K⁴)
$\sigma'_{SB}$    phonon Stefan-Boltzmann constant, W/(m² K⁴)

$\tau$    relaxation time, s; shear stress, Pa
$\tau'$    directional-hemispherical transmissivity
$\tau_{12}$    transmission coefficient
$\Phi$    scattering phase function; viscous dissipation function; potential function
$\phi$    number of degrees of freedom; azimuthal angle, rad; intermolecular potential
$\Psi$    Schrödinger’s wavefunction; various functions
$\psi$    molecular quantity; wavefunction; work function, J
$\Omega$    solid angle, sr; thermodynamic probability
$\omega$    angular frequency, rad/s
$\omega_p$    plasma frequency, rad/s
$\varpi$    velocity space, $d\varpi = dv_x\,dv_y\,dv_z$

SUBSCRIPTS

0    vacuum
1, 2, 3    medium 1, 2, 3
b    blackbody; boundary
d    defect or impurity
e    electron
h    hole
i    incident
i, j, k, l, m, n    indices
m    bulk or mean; maximum; medium
mp    most probable
n or p    n-type or p-type semiconductor
p    TM wave or p (parallel) polarization
r    reflected; rotational
s    TE wave or s (perpendicular) polarization; scattered; surface; solid; lattice
t    transmitted; translational
th    thermal
v    vibrational
w    wall
∞    free stream
λ, ν, or ω    spectral property in terms of wavelength, frequency, or angular frequency

NANO/MICROSCALE HEAT TRANSFER

CHAPTER 1

INTRODUCTION

Improvement in performance and shrinkage of device sizes in microelectronics have been major driving forces for scientific and economic progress over the past 30 years. Developments in semiconductor processing and surface sciences have allowed precise control over critical dimensions with desirable properties for solid state devices. In the past 20 years, there have been tremendous developments in micro- and nanoelectromechanical systems (MEMS and NEMS), microfluidics and nanofluidics, quantum structures and devices, photonics and optoelectronics, nanomaterials for molecular sensing and biomedical diagnosis, and scanning probe microscopy for measurement and manipulation at the molecular and atomic levels. Nanotechnology research has not only emerged as a new area of science and engineering, but it has also become an integral part of almost all natural science and engineering disciplines. According to the Web site of Georgia Institute of Technology (www.gatech.edu), more than 10% of the faculty members at the university have been involved with research projects related to nanoscience and nanoengineering. The same can be said for most major research universities in the United States and in many other countries. Furthermore, the study of nanoscience and nanoengineering requires and has resulted in close interactions across the boundaries of many traditional disciplines. Knowledge of physical behavior at the molecular and atomic levels has played and will continue to play an important role in our understanding of the fundamental processes occurring in the macro world. This will enable us to design and develop novel devices and machines, ranging from a few nanometers all the way to the size of automobiles and airplanes. We have already enjoyed camera phones and the iPod that can store thousands of pictures and songs. 
In the next few decades, the advancement of nano/microscale science and engineering will fundamentally restructure the technologies currently used in manufacturing, energy production and utilization, communication, transportation, space exploration, and medicine.1,2 A key issue associated with miniaturization is the tremendous increase in the heat dissipation per unit volume. Micro/nanostructures may enable engineered materials with unique thermal properties to allow significant enhancement or reduction of the heat flow rate. Therefore, knowledge of thermal transport from the micrometer scale down to the nanometer scale and thermal properties of micro/nanostructures is of critical importance to future technological growth. Solutions to more and more problems in small devices and systems require a solid understanding of the heat (or more generally, energy) transfer mechanisms in reduced dimensions and/or short time scales, because classical equilibrium and continuum assumptions are not valid anymore. Examples are the thermal analysis and modeling of micro/nanodevices, ultrafast laser interaction with materials, micromachined thermal sensors and actuators, thermoelectricity in nanostructures, photonic crystals, microscale thermophotovoltaic devices, and so forth.3,4

1.1 LIMITATIONS OF THE MACROSCOPIC FORMULATION

As an ancient Chinese philosopher put it, suppose you take a foot-long wood stick and cut off half of it each day; you will never reach an end even after thousands of years, as shown in Fig. 1.1. Modern science has taught us that, at some stage, one would reach the molecular level and even the atomic level, below which the physical and chemical properties are completely different from those of the original material. The wooden stick or slice would eventually become something else that is not distinguishable from the other constituents in the atmosphere. Basically, properties of materials at very small scales may be quite different from those of the corresponding bulk materials.

FIGURE 1.1 The length of the wood stick: $l_1 = 1$ ft in day 1, $l_2 = 1/2$ ft in day 2, and $l_n = 1/2^{n-1}$ ft in day n.

Note that 1 nm (nanometer) is one-billionth of a meter. The diameter of a hydrogen atom H is on the order of 0.1 nm, and that of a hydrogen molecule H₂ is approximately 0.3 nm. Using the formula $l_n = 0.3048/2^{n-1}$ m, where n is the number of days, we find $l_{30} \approx 5.7 \times 10^{-10}$ m (or 0.57 nm) after just a month, which is already near the diameter of a hydrogen atom. While atoms can still be divided with large and sophisticated facilities, our ability to observe, manipulate, and utilize them is very limited. On the other hand, most biological processes occur at the molecular level. Many novel physical phenomena happen at the length scale of a few nanometers and can be integrated into large systems. This is why the nanometer is a critical length scale for the realization of practically important new materials, structures, and phenomena. For example, carbon nanotubes with diameters ranging from 0.4 to 50 nm or so have dramatically different properties. Some researchers have shown that these nanotubes hold promise as the building block of nanoelectronics. Others have found that the thermal conductivity of single-walled carbon nanotubes at room temperature could be an order of magnitude higher than that of copper. Therefore, carbon nanotubes have been considered as a candidate material for applications that require a high heat flux.

In conventional fluid mechanics and heat transfer, we treat the medium as a continuum, i.e., indefinitely divisible without changing its physical nature. All the intensive properties can be defined locally and continuously. For example, the local density is defined as

$$\rho = \lim_{\delta V \to 0} \frac{\delta m}{\delta V} \tag{1.1}$$

where $\delta m$ is the mass enclosed within a volume element $\delta V$. When the characteristic dimension is comparable with or smaller than that of the mechanistic length (for example, the molecular mean free path, which is the average distance that a molecule travels between two collisions), the continuum assumption will break down. The density defined in Eq. (1.1) will depend on the size of the volume, $\delta V$, and will fluctuate with time even at macroscopic equilibrium. Noting that the mean free path of air at standard atmospheric conditions is about 70 nm, the continuum assumption is well justified for many engineering applications until the submicrometer regime or the nanoscale is reached. Nevertheless, if the pressure is very low, as in an evacuated chamber or at a high elevation, the mean free path can be very large; and thus, the continuum assumption may break down even at relatively large length scales. Within the macroscopic framework, we calculate the temperature distribution in a fluid or solid by assuming that the medium under consideration is not only a continuum but also at thermodynamic equilibrium everywhere. The latter condition is called the local-equilibrium assumption, which is required because temperature can be defined only for stable-equilibrium states. With extremely high temperature gradients at sufficiently small length scales and/or during very short periods of time, the assumption of local equilibrium may be inappropriate. An example is the interaction between short laser pulses and a material. Depending on the type of laser, the pulse duration or width can vary from a few tens of nanoseconds down to several femtoseconds (1 fs = 10⁻¹⁵ s). In the case of ultrafast laser interaction with metals, free electrons in the metal could gain energy quickly to arrive at an excited state corresponding to an effective temperature of several thousand kelvins, whereas the crystalline lattices remain near room temperature.
After an elapse of time represented by the electron relaxation time, the excess energy of electrons will be transferred to phonons, which are energy quanta of lattice vibration, thereby causing a heating effect that raises the temperature or changes the phase of the material under irradiation. Additional mechanisms may affect the behavior of a system as the physical dimensions shrink or as the excitation and detection times are reduced. A scale-down of the theories developed from macroscopic observations often proves to be unsuitable for applications involving micro/nanoscale phenomena. Examples are reductions in the conductivity of thin films or thin wires due to boundary scattering (size effect), discontinuous velocity and temperature boundary conditions in microfluidics, wave interferences in thin films, and tunneling of electrons and photons through narrow gaps. In the quantum limit, the thermal conductance of a nanowire will reach a limiting value that is independent of the material that the nanowire is made of. At the nanoscale, the radiation heat transfer between two surfaces can exceed that calculated from the Stefan-Boltzmann law by several orders of magnitude. Another effect of miniaturization is that surface forces (such as shear forces) will scale down with $L^2$, where $L$ is the characteristic length, while volume forces (such as buoyancy) will scale down with $L^3$. This will make surface forces predominant over volume forces at the microscale.
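Three of the scaling arguments in this section lend themselves to a quick numerical check. The following is a minimal sketch rather than part of the original text: the function names are illustrative, and the only inputs are the values quoted above (a 0.3048-m stick, a mean free path of about 70 nm for air, and the $L^2$ versus $L^3$ scaling of surface and volume forces).

```python
# Numerical checks of the scaling arguments in Sec. 1.1 (illustrative sketch).

FOOT_IN_M = 0.3048           # length of the stick on day 1, m
MEAN_FREE_PATH_AIR = 70e-9   # mean free path of air at standard conditions, m


def stick_length(day: int) -> float:
    """Length of the halved stick on a given day: l_n = 0.3048 / 2**(n - 1), in m."""
    if day < 1:
        raise ValueError("day must be >= 1")
    return FOOT_IN_M / 2 ** (day - 1)


def knudsen_number(mean_free_path: float, characteristic_length: float) -> float:
    """Kn = Lambda / L; the continuum assumption weakens as Kn approaches 1."""
    return mean_free_path / characteristic_length


def surface_to_volume_force_ratio(length: float, reference_length: float = 1.0) -> float:
    """Ratio of a surface force (~L**2) to a volume force (~L**3).

    Both forces are normalized to 1 at reference_length, so the ratio
    equals reference_length / length and grows as the system shrinks.
    """
    scale = length / reference_length
    return scale ** 2 / scale ** 3


if __name__ == "__main__":
    print(f"l_30 = {stick_length(30):.2e} m")
    for length in (1e-3, 1e-6, 100e-9):
        kn = knudsen_number(MEAN_FREE_PATH_AIR, length)
        ratio = surface_to_volume_force_ratio(length)
        print(f"L = {length:.0e} m: Kn = {kn:.2g}, surface/volume = {ratio:.1e}")
```

For a 1-µm feature in air, the Knudsen number is already 0.07, and the surface-to-volume force ratio is a million times larger than for a 1-m object, which is the sense in which surface forces become predominant at the microscale.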

1.2 THE LENGTH SCALES

It is instructive to compare the length scales of different phenomena and structures, especially against the wavelength of the electromagnetic spectrum. Figure 1.2 compares the wavelength ranges with some characteristic dimensions. One can see that MEMS generally produce micromachining capabilities from several millimeters down to a few micrometers. Currently, the smallest feature of integrated circuits is well below 100 nm. The layer thickness of thin films ranges from a few nanometers up to several micrometers. The wavelengths of the visible light are in the range from approximately 380 to 760 nm. On the other hand, thermal radiation covers a part of the ultraviolet, the entire visible and infrared, and a portion of the microwave region. The thickness of human hair is between 50 and 100 µm, while the diameter of red blood cells is about 6 to 8 µm. A typical optical microscope can magnify 100 times with a resolution of 200 to 300 nm, which is about half the wavelength and is limited due to the diffraction of light. Therefore, optical microscopy is commonly used to study micrometer-sized objects. On the other hand, atoms and molecules are on the order of 1 nm, which falls in the x-ray and electron-beam wavelength region. Therefore, x-ray and electron microscopes are typically used for determining crystal structures and defects, as well as for imaging nanostructures. The development of scanning probe microscopes (SPMs) and near-field scanning optical microscopes (NSOMs) in the 1980s enabled unprecedented capabilities for the visualization and manipulation of nanostructures, such as nanowires, nanotubes, nanocrystals, single molecules, individual atoms, and so forth, as will be discussed in Sec. 1.3.4. Figure 1.2 also shows that the mean free path of heat carriers (e.g., molecules in gases, electrons in metals, and phonons or lattice vibration in dielectric solids) often falls in the micrometer to nanometer scales, depending on the material, temperature, and type of carrier.

FIGURE 1.2 Characteristic length scales as compared with the wavelength of electromagnetic spectrum. [Chart not reproduced. It shows a length axis from 10 cm down to 0.1 nm. Wavelength ranges, in the order shown: radio waves, microwave, far-IR, mid-IR, near-IR, visible, ultraviolet, deep UV, x-ray, and e-beam, with thermal radiation spanning part of this range. Characteristic dimensions, in the order shown: semiconductor wafers; human hands and fingers; microelectronic chips; thickness of wafers (substrates); sand grains; thickness of human hair; MEMS; pollen grains and blood cells; thickness of films or coatings; surface roughness and mean free paths; smallest feature of integrated circuits; nanoparticles, nanorods, and nanowires; nanotubes and SPM tips; molecules; lattice constants; atoms.]

This book is motivated by the need to understand the thermal phenomena and heat transfer processes in micro/nanosystems and at very short time scales for solving problems occurring in contemporary and future technologies. A brief historical retrospective is given in the next section on the development of modern science and technologies, with a focus on the recent technological advances leading to nanotechnology. The role of thermal engineering throughout this technological advancement is outlined.
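The half-wavelength resolution estimate quoted in this section for optical microscopes can be checked against the visible range of 380 to 760 nm. The short sketch below is only an illustration of that rule of thumb; the function name is hypothetical, and the factor of one-half is the approximation stated in the text rather than a rigorous diffraction formula.

```python
# Half-wavelength estimate of optical resolution across the visible range.

VISIBLE_RANGE_NM = (380.0, 760.0)  # approximate visible wavelengths from the text, nm


def half_wavelength_resolution_nm(wavelength_nm: float) -> float:
    """Rule-of-thumb resolution limit, taken as half the wavelength, in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return wavelength_nm / 2.0


if __name__ == "__main__":
    for wavelength in VISIBLE_RANGE_NM:
        limit = half_wavelength_resolution_nm(wavelength)
        print(f"wavelength = {wavelength:.0f} nm -> resolution ~ {limit:.0f} nm")
```

The estimate spans roughly 190 to 380 nm, which brackets the 200 to 300 nm resolution cited for a typical optical microscope and shows why objects much smaller than a micrometer call for x-ray, electron-beam, or scanning probe techniques.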

1.3 FROM ANCIENT PHILOSOPHY TO CONTEMPORARY TECHNOLOGIES

Understanding the fundamentals of the composition of all things in the universe, their movement in space and with time, and the interactions between one and another is a human curiosity and the inner drive that makes us different from other living beings on the earth. The ancient Chinese believed that everything was composed of the five elements: metal, wood, water, fire, and earth (or soil) that generate and overcome one another in certain order and time sequence. These simple beliefs were not merely used for fortune-telling but have helped the development of traditional Chinese medicine, music, military strategy, astronomy, and the calendar. In ancient Greece, the four elements (fire, earth, air, and water) were considered as the realm wherein all things existed and whereof all things consisted. These classical element theories prevailed in several other countries in somewhat different versions for over 2000 years, until the establishment of modern atomic theory that began with John Dalton’s experiment on gases some 200 years ago. In 1811, Italian chemist Amedeo Avogadro introduced the concept of the molecule, which is a stable system or bound state of atoms. A molecule is the smallest particle that retains the chemical properties and composition of a pure substance. The first periodic table was developed by Russian chemist Dmitri Mendeleev in 1869. Although the original meaning of atom in Greek is “indivisible,” subatomic particles have since been discovered. For example, electrons as a subatomic particle were discovered in 1897 by J. J. Thomson, who won the 1906 Nobel Prize in Physics. An atom is known as the smallest unit of one of the 116 confirmed elements so far. The first industrial revolution began in the late eighteenth century and boosted the economy of western countries from manual labor to the machine age by the introduction of machine tools and textile manufacturing.
Following the invention of the steam engine in the mid-nineteenth century, the second industrial revolution had an even bigger impact on human life through the development of steam-powered ships and trains, along with the internal combustion engines, and the generation of electrical power. Newtonian mechanics and classical thermodynamics have played an indispensable role in the industrial revolutions. The development of machinery and the understanding of the composition of matter have allowed unprecedented precision of experimental investigation of physical phenomena, leading to the establishment of modern physics in the early twentieth century. The nature of light has long been debated. At the turn of the eighteenth century, Isaac Newton formulated the corpuscular theory of light and observed with his prism experiment that sunlight is composed of different colors. In the early nineteenth century, the discovery of infrared and ultraviolet radiation and Young’s double-slit experiment confirmed Huygens’ wave theory, which was overshadowed by Newton’s corpuscular theory for some 100 years. With the establishment of Maxwell’s equations that fully describe the electromagnetic waves and Michelson’s interferometric experiment, the wave theory of radiation had been largely accepted by the end of the nineteenth century. While the wave theory was able to explain most of the observed phenomena, it could not explain thermal emission over a wide spectrum, nor was it able to explain the photoelectric effect. Max Planck in 1901 used the hypothesis of light, or radiation quanta, or oscillators, to successfully derive the blackbody spectral distribution function. In 1905, Albert Einstein explained the photoelectric effect based on the concept of radiation quanta. To knock out an electron from the metal surface, the energy of each incoming radiation quantum ($h\nu$) must be sufficiently large because one electron can absorb only one quantum.
This explained why photoemission could not occur at frequencies below the threshold value, no matter how intense the incoming radiation might be. It appears that light is not indefinitely divisible but must exist in multiples of the smallest massless quanta, which are known as photons. In 1924, Louis de Broglie hypothesized that particles should also exhibit wavelike characteristics. With the electron diffraction experiment, it was found that electrons indeed can behave like waves with a wavelength inversely proportional to the momentum. Electron microscopy was based on the principle of electron diffraction. The wave-particle duality was essential to the establishment of quantum mechanics in the early twentieth century. Quantum mechanics describes the phenomena occurring in minute particles, structures, and their interaction with radiation, for which classical mechanics and electrodynamics are not applicable. The fundamental scientific understanding gained during the first half of the twentieth century has facilitated the development of contemporary technologies that have transformed the industrial economy into the knowledge-based economy and the machine age into the information age. The major technological advancements in the last half of the century are highlighted in the following sections.
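The two quantum relations invoked in this section, the photon energy $h\nu$ behind the photoelectric threshold and the de Broglie wavelength $\lambda = h/p$, can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions: the constants $h$, $c_0$, and $e$ are the values given in the List of Symbols, the electron rest mass (9.109 × 10⁻³¹ kg) is a standard value not quoted in the text, the 4.5-eV work function is a hypothetical example rather than a property of any particular metal, and the electron is treated nonrelativistically.

```python
import math

# Constants from the List of Symbols, except the electron mass.
H_PLANCK = 6.626e-34       # Planck's constant, J s
C0 = 2.998e8               # speed of light in vacuum, m/s
E_CHARGE = 1.602e-19       # electron charge, C (numerically, J per eV)
M_ELECTRON = 9.109e-31     # electron rest mass, kg (standard value, not from the text)


def photon_energy_ev(wavelength_m: float) -> float:
    """Energy of one radiation quantum, h*nu = h*c0/lambda, in eV."""
    return H_PLANCK * C0 / wavelength_m / E_CHARGE


def can_photoemit(wavelength_m: float, work_function_ev: float) -> bool:
    """True if a single photon carries at least the work function.

    Intensity does not enter: one electron absorbs only one quantum.
    """
    return photon_energy_ev(wavelength_m) >= work_function_ev


def de_broglie_wavelength_m(kinetic_energy_ev: float) -> float:
    """Nonrelativistic electron wavelength, lambda = h / sqrt(2 m E), in m."""
    momentum = math.sqrt(2.0 * M_ELECTRON * kinetic_energy_ev * E_CHARGE)
    return H_PLANCK / momentum


if __name__ == "__main__":
    work_function = 4.5  # hypothetical work function, eV
    for wavelength in (500e-9, 250e-9):
        energy = photon_energy_ev(wavelength)
        emits = can_photoemit(wavelength, work_function)
        print(f"{wavelength * 1e9:.0f} nm: {energy:.2f} eV, photoemission: {emits}")
    print(f"100-eV electron: {de_broglie_wavelength_m(100.0) * 1e9:.3f} nm")
```

With these assumptions, green light at 500 nm carries about 2.5 eV per photon and falls below the hypothetical threshold regardless of intensity, while a 100-eV electron has a wavelength near 0.12 nm, comparable to atomic spacings.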

1.3.1 Microelectronics and Information Technology

In his master’s thesis at MIT published in 1940, Claude Shannon (1916–2001) used Boolean algebra and showed how to use TRUE and FALSE to represent the function of switches in electronic circuits. Digital computers were invented during the 1940s in several countries, including the IBM Mark I, which was 2.4 m high and 16 m long. In 1948, while working at Bell Labs, Shannon published an article, “A Mathematical Theory of Communication,” which marked the beginning of modern communication and information technology.5 In that paper, he laid out the basic principles underlying the communication of information with two symbols, 1 and 0, and coined the term “bit” for a binary digit. His theory made it possible for digital storage and transmission of pictures, sounds, and so forth. In December 1947, three scientists invented the semiconductor point contact transistor at Bell Labs using germanium. The earlier computers and radios were based on bulky vacuum tubes that generated a huge amount of heat. The invention of the transistor by William Shockley, John Bardeen, and Walter Brattain was recognized through the Nobel Prize in Physics conferred on them in 1956. Intense research on semiconductor physics using the atomic theory and on the mechanism of point contact had been needed for the fabrication of transistors to become possible. The invention of transistors ushered in the information age with a whole new industry. In 1954, Gordon Teal at Texas Instruments built the first silicon transistor. The native oxide of silicon appeared to be particularly suitable as the electric insulator. In 1958, Jack Kilby (1923–2005) at Texas Instruments was able to cram all the discrete components onto a silicon base and later onto one piece of germanium. He filed a patent application the next year on “Miniaturized Electronic Circuits,” where he described how to make integrated circuits and connect the passive components via gold wires.
Working independently, Robert Noyce at Fairchild Electronics in California found aluminum to adhere well to both silicon and silicon oxide and filed a patent application in 1959 on “Semiconductor Deviceand-Lead Structure.” Kilby and Noyce are considered the coinventors of integrated circuits. Noyce was one of the founders of Intel and died in 1990. Kilby was awarded half of the Nobel Prize in Physics in 2000 “for his part in the invention of the integrated circuit.” (See http://nobelprize.org/nobel_prizes/). The other half was shared by Zhores Alferov and Herbert Kroemer for developing semiconductor heterogeneous structures used in optoelectronics, to be discussed in the next section. In 1965, around 60 transistors could be packed on a single silicon chip. Seeing the fast development and future potential of integrated circuits, Gordon Moore, a cofounder of

INTRODUCTION

7

Intel, made a famous prediction that the number and complexity of semiconductor devices would double every year.6 This is Moore’s law, well-known in the microelectronics industry. In the mid-1970s, the number of transistors on a chip increased from 60 to 5000. By 1985, the Intel 386 processor contained a quarter million transistors on a chip. In 2001, the Pentium 4 processor reached 42 million transistors. The number has now exceeded 1 billion per chip in 2006. When the device density is plotted against time in a log scale, the growth almost follows a straight line, suggesting that the packaging density has doubled approximately every 18 months.6 Reducing the device size and increasing the packaging density have several advantages. For example, the processor speed increases by reducing the distance between transistors. Furthermore, new performance features can be added into the chip to enhance the performance. The cost for the same performance also reduces. Advanced supercomputer systems have played a critical role in enabling modeling and understanding micro/nanoscale phenomena. The process is first to grow high-quality silicon crystals and then dice and polish into wafers. Devices are usually made on SiO2 layer that can be grown by heating the wafer to sufficiently high temperatures in a furnace with controlled oxygen partial pressure. The wafers are then patterned using photolithographic techniques combined with etching processes. Donors and acceptors are added to the wafer to form n- and p-type regions by ion implantation and then annealed in a thermal environment. Metals or heavily doped polycrystalline silicon are used as gates with proper coverage and patterns through lithography. A schematic of metal-oxide-semiconductor field-effect transistor (MOSFET) is shown in Fig. 1.3. Millions of transistors can be packed in 1-mm2 area with several layers

FIGURE 1.3 Schematic of a metal-oxide-semiconductor field-effect transistor (MOSFET).

through very-large-scale integration (VLSI) with the smallest features smaller than 100 nm. As mentioned earlier, managing heat dissipation is a challenge, especially as the device dimensions continue to shrink. Local heating or hot spots on the scale of 10 nm can cause device failure. The principles governing heat transfer at the nanoscale are very different from those at large scales. A fundamental understanding of phonon transport is required for device-level thermal analysis. Furthermore, understanding heat transfer in microfluidics is necessary to enable reliable device cooling at the micro- and nanoscales. Additional discussions will be given in subsequent chapters of the book.

The progress in microelectronics would not have been possible without advances in materials and processing, such as crystal growth and thermal processing during semiconductor manufacturing, as well as deposition and photolithographic technologies. Rapid thermal processing (RTP) is necessary during annealing and oxidation to prevent ions from diffusing deep into the wafer. Thermal modeling of RTP must consider the combined conduction, convection, and radiation modes. A light-pipe thermometer is commonly used to monitor the temperature of the wafer. In an RTP furnace, the thermal radiation emitted by the wafer is collected by the light pipe and then transmitted to the radiometer for inferring the surface temperature.7 In some cases,

the wafer surface is rough with anisotropic features. A better understanding of light scattering by anisotropic rough surfaces is also necessary. According to the International Technology Roadmap for Semiconductors, the gate length and the junction depth will be 25 and 13.8 nm, respectively, for the 65-nm devices used in high-performance complementary-metal-oxide-semiconductor (CMOS) technology.8 High-intensity Ar or Xe arc lamps with millisecond optical pulses are considered a suitable annealing tool following ion implantation in ultra-shallow junction fabrication. Because the optical energy is absorbed within milliseconds, thermal diffusion cannot distribute heat uniformly across the wafer surface. Therefore, temperature uniformity across the nanometer-patterned wafer is expected to be a critical issue. To reduce the feature size further, deep-UV lithography and x-ray lithography have also been developed. It is predicted that Moore’s law will reach its limit in 2017, when the critical dimensions would be less than 10 nm. Further reduction will be subject to serious barriers due to problems associated with gate dielectrics and fabrication difficulties. Molecular nanoelectronics using self-assembly is sought as an alternative, along with quantum computing. Therefore, nanoelectronics and quantum computing are anticipated to brighten the future of electronics and computing.
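As a rough check on the doubling times quoted for Moore's law in this section, the short Python sketch below estimates the average doubling period implied by the transistor counts given in the text (60 in 1965, a quarter million in 1985, 42 million in 2001, and 1 billion in 2006). The function name and the choice of intervals are illustrative assumptions, and the counts are rounded figures, so the result should be read only as an estimate.

```python
import math

# Approximate transistor counts per chip quoted in the text.
COUNTS = {1965: 60, 1985: 250_000, 2001: 42_000_000, 2006: 1_000_000_000}

def doubling_time_months(year0, n0, year1, n1):
    """Average doubling period, in months, for growth from n0 to n1."""
    doublings = math.log2(n1 / n0)          # number of doublings in the interval
    return 12.0 * (year1 - year0) / doublings

if __name__ == "__main__":
    years = sorted(COUNTS)
    first, last = years[0], years[-1]
    overall = doubling_time_months(first, COUNTS[first], last, COUNTS[last])
    print(f"{first}-{last}: one doubling every {overall:.1f} months")
    # Interval-by-interval estimates show how much the rounded data scatter.
    for y0, y1 in zip(years, years[1:]):
        t = doubling_time_months(y0, COUNTS[y0], y1, COUNTS[y1])
        print(f"{y0}-{y1}: one doubling every {t:.1f} months")
```

With these counts, the 1965 to 2006 average works out to roughly 20 months per doubling, close to the 18-month figure quoted above; the individual intervals scatter around that value because the counts are rounded.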

1.3.2 Lasers, Optoelectronics, and Nanophotonics

It is hard to imagine what the current technology would look like without lasers. Lasers of different types have tremendous applications in metrology, microelectronics fabrication, manufacturing, medicine, and communication. Examples are laser printers, laser bar code readers, laser Doppler velocimetry, laser machining, and laser corneal surgery for vision correction. The concept of the laser was demonstrated in the late 1950s independently in the United States and the Soviet Union during the Cold War. The Nobel Prize in Physics of 1964 recognized the fundamental contributions in the field of quantum electronics by Charles Townes, Nicolay Basov, and Aleksandr Prokhorov. The first working laser was a ruby laser built by Theodore Maiman at Hughes Aircraft Company in 1960. The principle of lasers dates back to 1917, when Einstein elegantly depicted his conception of stimulated emission of radiation by atoms. Unlike thermal emission and plasma emission, lasers are coherent light sources and, with the assistance of an optical cavity, can emit nearly monochromatic light in a single direction with little divergence. Lasers enabled the field of nonlinear optics, which is important for understanding the fundamentals of light-matter interactions, communication, and optical computing. In 1981, Nicolaas Bloembergen and Arthur Schawlow received the Nobel Prize in Physics for their contributions in laser spectroscopy. There are a variety of nonlinear spectroscopic techniques, including Raman spectroscopy, as reviewed by Fan and Longtin.9 Two-photon spectroscopy has become an important tool for molecular detection.10 Furthermore, two-photon 3-D lithography has also been developed for microfabrication.11 Gas lasers such as He-Ne (red) and Ar (green) have been extensively used for precision alignment, dimension measurements, and laser Doppler velocimetry due to their narrow linewidth.
On the other hand, powerful Nd:YAG and CO2 lasers are used in thermal manufacturing, where the heat transfer processes include radiation, phase change, and conduction.12 Excimer lasers create nanosecond pulses in the ultraviolet and have been extensively used in materials processing, ablation, eye surgery, dermatology, as well as photolithography in microelectronics and microfabrication. High-energy nanosecond pulses can also be produced by Q-switching, typically with a solid-state laser such as the Nd:YAG laser at a wavelength near 1 µm. On the other hand, the mode-locking technique allows pulse widths from picoseconds down to a few femtoseconds. Pulse durations less than 10 fs have been achieved since 1985. Ultrafast lasers have enabled the study of reaction dynamics and formed a branch of chemistry called femtochemistry. Ahmed Zewail of Caltech received the

1999 Nobel Prize in Chemistry for his pioneering research in this field. In 2005, John Hall and Theodor Hänsch received the Nobel Prize in Physics for developing laser-based precision spectroscopy, in particular, the frequency comb technique. Short-pulse lasers can facilitate fabrication, the study of electron-phonon interaction in nonequilibrium processes, measurement of thermal properties including interface resistance, nondestructive evaluation of materials, and so forth.13–16

Room-temperature continuous-operation semiconductor lasers were realized in May 1970 by Zhores Alferov and coworkers at the Ioffe Physical Institute in Russia, and independently by Morton Panish and Izuo Hayashi at Bell Labs a month later. Alferov received the Nobel Prize in Physics in 2000, together with Herbert Kroemer, who conceived the idea of the double-heterojunction laser in 1963 and was also an early pioneer of molecular beam epitaxy (MBE). Invented in 1968 by Alfred Cho and John Arthur at Bell Labs and developed in the 1970s, MBE is a high-vacuum deposition technique that enables the growth of highly pure semiconductor thin films with atomic precision. The name heterojunction refers to two layers of semiconductor materials with different bandgaps, such as a GaAs/AlₓGa₁₋ₓAs pair. In a double-heterojunction structure, a lower-bandgap layer is sandwiched between two higher-bandgap layers.17 When the middle layer is made thin enough, on the order of a few nanometers, the structure is called a quantum well because of the discrete energy levels and enhanced density of states. Quantum well lasers can have better performance with a smaller driving current. Multiple quantum wells (MQWs), also called superlattices, that consist of periodic structures can also be used to further improve the performance. In a laser setting, an optical cavity is needed to confine the laser bandwidth as well as enhance the intensity at a desired wavelength with narrow linewidth.
Distributed Bragg reflectors (DBRs) are used on both ends of the quantum well (active region). DBRs are the simplest photonic crystals made of periodic dielectric layers of different refractive indices; each layer thickness is equal to a quarter of the wavelength in that medium (λ/n). DBRs are dielectric mirrors with nearly 100% reflectance, except at the resonance wavelength λ, where light will eventually escape from the cavity. Figure 1.4 illustrates a vertical cavity surface emitting laser (VCSEL), where light is emitted through the substrate (bottom of the structure). The energy transfer mechanisms through phonon waves and electron waves have been extensively investigated.18
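To illustrate how quarter-wave layers build up a highly reflecting mirror, the following Python sketch computes the layer thickness λ/(4n) and the normal-incidence reflectance of a lossless stack using the standard characteristic-matrix method for thin films. The refractive indices (3.5 and 3.0, roughly representative of GaAs and AlAs in the near infrared), the 980-nm design wavelength, the air incidence medium, and all function names are assumptions chosen for illustration only; they are not taken from the text.

```python
import cmath
import math

def quarter_wave_thickness(wavelength_vac, n):
    """Layer thickness equal to one quarter of the wavelength in the medium."""
    return wavelength_vac / (4.0 * n)

def layer_matrix(n, d, wavelength_vac):
    """Characteristic matrix of one lossless layer at normal incidence."""
    delta = 2.0 * math.pi * n * d / wavelength_vac   # phase thickness
    c, s = cmath.cos(delta), cmath.sin(delta)
    return ((c, 1j * s / n), (1j * n * s, c))

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dbr_reflectance(n_hi, n_lo, pairs, wavelength_vac, design_wavelength,
                    n_incident=1.0, n_substrate=3.5):
    """Reflectance of a stack of (high, low) quarter-wave pairs on a substrate."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(pairs):
        for n in (n_hi, n_lo):                       # high-index layer faces the light
            d = quarter_wave_thickness(design_wavelength, n)
            m = matmul2(m, layer_matrix(n, d, wavelength_vac))
    b = m[0][0] + m[0][1] * n_substrate
    c = m[1][0] + m[1][1] * n_substrate
    r = (n_incident * b - c) / (n_incident * b + c)  # amplitude reflection coefficient
    return abs(r) ** 2

if __name__ == "__main__":
    lam = 980e-9  # assumed design wavelength in vacuum, m
    print(f"quarter-wave thickness for n = 3.5: "
          f"{quarter_wave_thickness(lam, 3.5) * 1e9:.1f} nm")
    for pairs in (5, 10, 20):
        print(f"{pairs:2d} pairs: R = {dbr_reflectance(3.5, 3.0, pairs, lam, lam):.4f}")
```

Under these assumptions, the reflectance at the design wavelength rises toward unity as layer pairs are added, which is why practical mirrors of this kind use many periods when the index contrast is small.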

FIGURE 1.4 Schematic of a VCSEL laser made of heterogeneous quantum well structure. The smaller layer thickness can be 3 nm, and there can be as many as several hundred layers.

Further improvement in the laser efficiency and control of the wavelength has been made using quantum wires and quantum dots (QDs).17 Semiconductor lasers are the most popular lasers (in quantity), and several hundred million units are sold each year. Their applications include CD/DVD reading/writing, optical communication, laser pointers, laser printers, bar code readers, and so forth. A simpler device is the light-emitting diode (LED), which emits incoherent light with a two-layer p-n junction without DBRs. LEDs have been used for lighting, including traffic lights with improved efficiency and decorative lights. The development of wide-bandgap materials, such as GaN and AlN epitaxially grown through metal-organic chemical vapor deposition (MOCVD), allows the LED and semiconductor laser wavelength to be pushed to the blue and ultraviolet. Organic light-emitting diodes (OLEDs) based on electroluminescence are being developed as a promising candidate for the next-generation computer and TV displays. Alongside the development of light sources, there have been continuous development and improvement in photodetectors, mainly in focal plane arrays, charge-coupled devices (CCDs), quantum well detectors, readout electronics, data transfer and processing, compact refrigeration and temperature control, and so forth.

On the other hand, optical fibers have become an essential and rapidly growing technology in telecommunication and computer networks. The optical fiber technology for communication was developed in the 1970s along with the development of semiconductor lasers. In 1978, Nippon Telegraph and Telephone (NTT) demonstrated the transmission of 32 Mbps (million bits per second) through 53 km of graded-index fiber at 1.3-µm wavelength. By 2001, 3 × 10¹¹ m of fiber-optic wires had been installed worldwide; this is a round-trip from the earth to the sun. In March 2006, NEC Corporation announced a 40-Gbps optical-fiber transmission system.
Optical fibers have also been widely applied as sensors for biochemical detection as well as temperature and pressure measurements. The fiber drawing process involves complicated heat transfer and fluid dynamics at different length scales and temperatures.19

Nanophotonics is an emerging frontier that integrates photonics with physics, chemistry, biology, materials science, manufacture, and nanotechnology. The foundation of nanophotonics is to study interactions between light and matter, to explore the unique characteristics of nanostructures for utilizing light energy, and to develop novel nanofabrication and sensing techniques. Recent studies have focused on photonic crystals, nanocrystals, plasmonic waveguides, nanofabrication and nanolithography, light interaction with organic materials, biophotonics, biosensors, quantum electrodynamics, nanocavities, quantum dot and quantum wire lasers, solar cells, and so forth. Readers are referred to Prasad20 for an extensive discussion of the recent developments. In the field of thermal radiation, nanoscale radiative transfer and properties have become an active research area, and a special issue of the Journal of Heat Transfer is devoted to this exciting area.21

1.3.3 Microfabrication and Nanofabrication

Richard Feynman, one of the best theoretical physicists of his time and a Nobel Laureate in Physics, delivered a visionary speech at Caltech in December 1959, entitled “There’s plenty of room at the bottom.” At that time, lasers did not yet exist, integrated circuits had just been invented and were not yet practically useful, and a single computer slower than a present-day handheld calculator would occupy a whole classroom and generate an enormous amount of heat.
Feynman envisioned the future of controlling and manipulating things on very small scales, such as writing (with an electron beam) the whole 24 volumes of the Encyclopedia Britannica on the head of a pin and rearranging atoms one at a time.22 Many of the things Feynman predicted were once considered science fiction or jokes but have since been realized in practice, especially since the 1980s. In 1983, Feynman gave a second talk about the use of a swimming machine as a medical device (the surgeon that you could swallow), as well as quantum computing.22 In the 1990s, micromachining and MEMS

emerged as an active research area, with great success in the commercialization of micromachined accelerometers for automobile airbags. Using etching and lithographic techniques, engineers were able to manufacture microscopic machines with moving parts, as shown in Fig. 1.5, such as gears with a size less than the cross section of a human hair.

FIGURE 1.5 MEMS structures. (a) A dust mite on a microfabricated mirror assembly, where the gears are smaller than the thickness of human hair. (b) Drive gear chain with linkages, where coagulated red blood cells are on the upper left and the lower right and a grain of pollen is on the upper right. (Courtesy of Sandia National Laboratories, SUMMiT Technologies, www.mems.sandia.gov.)

The technologies used in microfabrication have been extensively discussed in the text of Madou.23 These MEMS devices were later developed as tools for biological and medical diagnostics, such as the so-called lab-on-a-chip, with pump, valve, and analysis sections on the 10 to 100 µm scale. In aerospace engineering, an application is to build micro-air vehicles or microflyers, with sizes ranging from a human hand down to a bumblebee, that could be used for surveillance and reconnaissance under extreme conditions. Microchannels and microscale heat pipes have also been developed and tested for electronic cooling applications. The study of microfluidics has naturally become an active research area in mechanical engineering. The development of SPM and MEMS technologies, together with materials development through self-assembly and other technologies, led to further development of even smaller structures and the bottom-up approach of nanotechnology. Laser-based manufacturing, focused ion beam (FIB), and electron-beam lithography have also been developed to facilitate nanomanufacturing. In NEMS, quantum behavior becomes important and quantum mechanics is essential to understanding the behavior.

Robert Curl, Harold Kroto, and Richard Smalley won the Nobel Prize in Chemistry in 1996 for their discovery of fullerenes in 1985 at Rice University, during a period when Kroto was visiting from the University of Sussex. The group used pulsed laser irradiation to vaporize graphite and form carbon plasma in a pressurized helium gas stream. The result, as diagnosed by time-of-flight mass spectroscopy, suggested that self-assembled C60 molecules were formed and would be shaped like a soccer ball with 60 vertices made of the 60 carbon atoms; see Kroto et al., Nature, 318, 162 (1985). It was later confirmed that the molecules were indeed C60, with a diameter on the order of 1 nm, and that they exhibit wave-particle duality.
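The soccer-ball picture of C60 can be checked with simple counting. A standard soccer ball (a truncated icosahedron) has 12 pentagonal and 20 hexagonal faces; this face count is a geometric fact assumed here rather than stated in the text. The Python sketch below, with a hypothetical helper name, counts the vertices and edges and verifies Euler's polyhedron formula V − E + F = 2.

```python
def fullerene_counts(pentagons=12, hexagons=20):
    """Vertices, edges, and faces of a closed cage built from 5- and 6-sided faces."""
    faces = pentagons + hexagons
    corners = 5 * pentagons + 6 * hexagons   # face corners counted face by face
    vertices = corners // 3                  # each carbon atom is shared by three faces
    edges = corners // 2                     # each bond is shared by two faces
    return vertices, edges, faces

if __name__ == "__main__":
    v, e, f = fullerene_counts()
    print(f"vertices (carbon atoms): {v}")
    print(f"edges (bonds): {e}")
    print(f"faces: {f}")
    print(f"Euler characteristic V - E + F = {v - e + f}")
```

The count gives 60 vertices, matching the 60 carbon atoms, along with 90 bonds and 32 faces.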
This type of carbon allotrope is called a buckminsterfullerene, or fullerene, or buckyball, after the famous architect Buckminster Fuller (1895–1983) who designed geodesic domes. In 1991, Sumio Iijima of NEC Corporation synthesized carbon nanotubes (CNTs) using

arc discharge. Soon his group and an IBM group were able to produce single-walled carbon nanotubes (SWNTs) with a diameter on the order of 1 nm. There have been intensive studies of CNTs for hydrogen storage, nanotransistors, field emission, light emission and absorption, quantum conductance, nanocomposites, and high thermal conductivity. Figure 1.6a shows CNT growth in a room-temperature environment by chemical vapor deposition on a heated cantilever tip with a size around 5 µm.24 Figure 1.6b shows the synthesized SWNTs with encapsulated metallofullerenes of Gd:C82 (i.e., a gadolinium atom inside a fullerene molecule). The high-resolution transmission electron microscope (TEM) image suggests that the diameter of the SWNT is from 1.4 to 1.5 nm.25 It should be noted that electron microscopes, including SEM and TEM, have become powerful tools for imaging micro/nanoscale objects with a magnification up to 2 million. The first electron microscope was built by Ernst Ruska and Max Knoll in Germany during the early 1930s, and Ruska shared the Nobel Prize in Physics in 1986 for his contributions to electron optics and microscopy. Various nanostructured materials have been synthesized, such as silicon nanowires, InAs/GaAs QDs, and Ag nanorods. Figure 1.6c shows some images of nanohelices or nanosprings made of ZnO nanobelts or nanoribbons using a solid-vapor process.26 These structures, self-assembled under controlled conditions, could be fundamental to the study of electromagnetically coupled nanodevices for use as sensors and actuators, as well as the growth dynamics at the nanoscale.

One of the successful technologies that operate in the quantum mechanical regime is the giant magnetoresistive (GMR) head and hard drive. The GMR head is based on ferromagnetic layers separated by an extremely thin (about 1 nm) nonferromagnetic spacer, such as Fe/Cr/Fe and Co/Cu/Co. MBE enabled the metallic film growth with the required precision and quality.
The electrical resistance of GMR materials depends strongly on the applied magnetic field, which affects the spin states of electrons. IBM first introduced this technology in 1996, which was only about 10 years after the publication of the original research results; see Grünberg et al., Phys. Rev. Lett., 57, 2442 (1986); Baibich et al., Phys. Rev. Lett., 61, 2472 (1988). GMR materials have been extensively used in computer hard drives and read/write heads. Overheating, due to friction with the disk surface, can render the data unreadable for a short period until the head temperature stabilizes; such an effect is called thermal asperity. Yang et al. performed a detailed thermal characterization of Cu/CoFe superlattices for GMR head applications using MEMS-based thermal metrology tools.27

1.3.4 Probing and Manipulation of Small Structures

Tunneling by elementary particles is a quantum mechanical phenomenon arising from wavelike behavior. A potential barrier normally confines particles to either side of the barrier, like a mountain that is so high as to separate people on one side from those on the other. When the barrier is thin enough, quantum tunneling can occur and particles can transmit through the barrier, as if a tunnel were dug through the mountain. An example is an insulator between two metal strips. Trained in mechanical engineering, Ivar Giaever performed the first tunneling experiment with superconductors in 1960 at the General Electric Research Laboratory and received the 1973 Nobel Prize in Physics, together with Leo Esaki of IBM and Brian Josephson. Esaki made significant contributions in semiconductor tunneling, superlattices, and the development of MBE technology. He invented a tunneling diode, called the Esaki diode, which is capable of very fast operation in the microwave region. Josephson further developed the tunneling theory and a device, called a Josephson junction, which is used in superconducting quantum interference devices (SQUIDs) for measuring extremely small magnetic fields. SQUIDs are used in magnetic resonance imaging (MRI) for medical diagnostics.
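How strongly tunneling depends on barrier thickness can be seen from the textbook estimate for a thick rectangular barrier, T ≈ exp(−2κd), with κ = √(2m(V₀ − E))/ħ. The Python sketch below evaluates this estimate for an electron. The 4-eV effective barrier height (of the order of a typical metal work function) and the function names are assumptions for illustration; the expression neglects the prefactor of the exact solution and is not taken from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # one electronvolt in joules

def decay_constant(barrier_ev, mass=M_E):
    """Decay constant kappa (1/m) for a particle barrier_ev below the barrier top."""
    return math.sqrt(2.0 * mass * barrier_ev * EV) / HBAR

def transmission(barrier_ev, thickness_m, mass=M_E):
    """Approximate tunneling probability exp(-2*kappa*d) for a thick barrier."""
    return math.exp(-2.0 * decay_constant(barrier_ev, mass) * thickness_m)

if __name__ == "__main__":
    barrier = 4.0  # assumed effective barrier height, eV
    print(f"kappa = {decay_constant(barrier) * 1e-9:.2f} per nm")
    for d_nm in (0.5, 1.0, 1.5):
        t = transmission(barrier, d_nm * 1e-9)
        print(f"d = {d_nm:.1f} nm: T ~ {t:.2e}")
```

With these assumed values, increasing the barrier thickness by only 0.1 nm lowers the tunneling probability by nearly an order of magnitude, which illustrates why tunneling is appreciable only for barriers on the order of a nanometer or thinner.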

FIGURE 1.6 Examples of nanostructures. (a) SEM image of CNTs grown on heated cantilever tip. (Reprinted with permission from Sunden et al.,24 copyright 2006, American Institute of Physics.) (b) Buckyballs inside a SWNT (the lower is a TEM image in which the nanotube diameter is 1.4 to 1.5 nm). (Reprinted with permission from Hirahara et al.,25 copyright 2000, American Physical Society.) (c) TEM images of ZnO nanobelts that are coiled into nanohelices or nanosprings. [Reprinted with permission from Gao et al.,26 copyright 2005, AAAS (image courtesy of Prof. Z. L. Wang, Georgia Tech).]

In 1981, Gerd Binnig and Heinrich Rohrer of IBM Zurich Research Laboratory developed the first scanning tunneling microscope (STM) based on electron tunneling through vacuum. This invention has enabled the detection and manipulation of surface phenomena at the atomic level and, thus, has largely shaped nanoscale science and technology through further development of similar instrumentation. Binnig and Rohrer shared the Nobel Prize in Physics in 1986, along with Ruska, who developed the first electron microscope as mentioned earlier. STM uses a sharp stylus probe tip and piezoelectricity for motion control. When the tip is about 1 nm from the surface, electrons can tunnel between the tip and the conductive substrate. The tunneling current is very sensitive to the gap. Therefore, by maintaining the tip in position and scanning the substrate in the x-y direction with a constant current (or distance), the height variation can be obtained with extremely good resolution (0.02 nm). Using STM, Binnig et al. soon obtained the real-space reconstruction of the 7 × 7 unit cells of Si(111).28 In 1993, another group at IBM Almaden Research Center was able to manipulate iron atoms to create a 48-atom quantum corral on a copper substrate.29 The images have appeared on the front covers of many magazines, including Science and Physics Today. STM can also be used to assemble organic molecules and to study DNA molecules.2

In 1986, Gerd Binnig, Calvin Quate, and Christoph Gerber developed another type of SPM, i.e., the atomic force microscope (AFM), which can operate without a vacuum environment and on electrical insulators.30 AFM uses a tapered tip at the end of a cantilever and an optical position sensor, as shown in Fig. 1.7. The position sensor is very sensitive to the

FIGURE 1.7 Schematic of an atomic force microscope (AFM).

bending of the cantilever (with a 0.1-nm vertical resolution). When the tip is brought close to the surface, there exist intermolecular forces (repulsive or attractive) between the tip and the atoms on the underneath surface. In the contact mode, the cantilever is maintained in position using the servo signal from the position-sensing diode to adjust the height of the sample, while it scans in the lateral direction. Surface topographic data can be obtained in

an ambient environment for nonconductive materials. Other SPMs have also been developed and the family of SPMs is quite large today. Wickramasinghe and coworkers first investigated thermal probing by attaching a thermocouple to the cantilever tip.31 Later, Arun Majumdar’s group developed several types of scanning thermal microscope (SThM) for nanoscale thermal imaging of heated samples, including microelectronic devices and nanotubes.32 Recently, researchers have modified SThM for measuring and mapping thermoelectric power at nanoscales.33 Because of its simplicity, AFM has become one of the most versatile tools in nanoscale research, including friction measurements, nanoscale indentation, dip-pen nanolithography, and so forth. Heated cantilever tips were proposed for nanoscale indentation or writing on the polymethyl methacrylate (PMMA) surface, either using a laser or by heating the cantilever legs.34 The method was further developed to concentrate the heat dissipation at the tip by using heavily doped legs as electrical leads, resulting in writing (with a density near 500 Gb/in²) and erasing (with a density near 400 Gb/in²) capabilities. The temperature signal measured by the tip resistance can also be used to read the stored data due to the difference in heat loss as the tip scans the area.35 In an effort to improve the data-writing speed, IBM initiated the “millipede” project in 2000 and succeeded in making a 32 × 32 heated-cantilever array in which each cantilever was separately controlled.36 Obviously, heat transfer and mechanical characteristics are at the center of these systems. The heated AFM cantilever tips have been used as a local heating source for a number of applications, including the above-mentioned local CVD growth of CNTs and thermal dip-pen nanolithography.37
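The areal densities quoted above for thermomechanical writing and erasing can be converted into an approximate spacing between neighboring bits. The Python sketch below does this under two assumptions that are not stated in the text: the bits sit on a square grid, and 1 Gb means 10⁹ bits. The function name is hypothetical.

```python
import math

INCH_M = 0.0254  # one inch in meters

def bit_pitch_nm(density_gb_per_in2):
    """Center-to-center bit spacing (nm) for a square grid at the given areal density."""
    bits_per_m2 = density_gb_per_in2 * 1e9 / INCH_M**2
    return 1e9 / math.sqrt(bits_per_m2)

if __name__ == "__main__":
    for label, density in (("writing", 500.0), ("erasing", 400.0)):
        print(f"{label}: {density:.0f} Gb/in^2 -> pitch of about "
              f"{bit_pitch_nm(density):.0f} nm")
```

Both densities correspond to a bit pitch of a few tens of nanometers, which indicates the length scale over which the heated tip must confine its thermal and mechanical effect.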

1.3.5 Energy Conversion Devices

Nanostructures may have unique thermal properties that can be used to facilitate heat transfer for heat removal and thermal management applications. An example mentioned earlier is the use of nanotubes with high thermal conductivity, although nanotube bundles often suffer from interface resistance and phonon scattering by defects and boundaries. Recently, there have been a number of studies on nanofluids, which are liquids with suspensions of nanostructured solid materials, such as nanoparticles, nanofibers, and nanotubes.38 Enhanced thermal conductivity and increased heat flux have been demonstrated; however, the mechanisms that contribute to the enhancement and temperature dependence are still being debated.39

Thermoelectricity utilizes the principles of irreversible thermodynamics for thermal-electrical conversion and can be used for cooling in microelectronics as well as miniaturized power generation. A critical issue is to enhance the figure of merit, which benefits from a reduced thermal conductivity. Multilayer heterogeneous structures create heat barriers due to size effects and the boundary resistance. These structures have been extensively studied in the literature and demonstrate enhanced performance. Understanding the thermal and electrical properties of heterogeneous structures is critically important for future design and advancement.40

Fast-depleting reserves of conventional energy sources have resulted in an urgent need for increasing energy conversion efficiencies and recycling of waste heat. Potential candidates for fulfilling these requirements include thermophotovoltaic devices, which generate electricity from either the complete combustion of different fuels or the waste heat of other energy sources, thereby saving energy. The thermal radiation from the emitter is incident on a photovoltaic cell, which generates electrical currents.
Applications of such devices range from hybrid electric vehicles to power sources for microelectronic systems. At present, thermophotovoltaic systems suffer from low conversion efficiency. Nanostructures have been extensively used to engineer surfaces with designed absorption, reflection, and emission characteristics. Moreover, at the nanoscale, the radiative energy transfer can be greatly enhanced due to tunneling and enhanced local density of states. A viable solution to

increase the thermophotovoltaic efficiency is to apply microscale radiation principles in the design of different components to utilize the characteristics of thermal radiation at small distances and in microstructures.41 Nanostructures can also help increase the energy conversion efficiency and reduce the cost of solar cells. Figure 1.8 shows the device structure of a

FIGURE 1.8 ZnO nanowires for dye-sensitized solar cells, from Law et al.42 (Reprinted by permission from Macmillan Publishers Ltd.: Nature Materials, copyright 2005.) The height of the wires is near 16 µm and their diameters vary between 130 and 200 nm. (a) Schematic of the cell with light incident through the bottom electrode; the labeled parts are the platinized electrode, the dye-coated nanowire array in electrolyte, and the transparent electrode. (b) SEM image of a cleaved nanowire array (scale bar: 5 µm).

ZnO-nanowire array for dye-sensitized solar cells.42 This structure can greatly enhance the absorption or quantum efficiency over nanoparticle-based films. Knowledge of the spectral and directional absorptance of nanostructures and heat dissipation mechanisms is critically important for further advancement of this type of device.43 Hydrogen technologies are being considered and actively pursued as the energy source of the future. There are two ways in which hydrogen H2 may be used: one is in a combustion heat engine where hydrogen reacts with oxygen intensively while releasing heat; the other is in a fuel cell where electrochemical reaction occurs quietly to generate electricity just like a battery. Because the only reaction product is water, hydrogen-powered automobiles can be made pollution free in principle. Grand challenges exist in generation, storage,

and transport of hydrogen. If all hydrogen is obtained from fossil fuels, there will be no reduction in either the fossil fuel consumption or the carbon dioxide emission, except that the emission is centralized in the hydrogen production plant. Alternatively, hydrogen may be produced from water with other energy sources, such as renewable energy sources. Nanomaterials are being developed for several key issues related to hydrogen technologies, such as hydrogen storage using nanoporous materials, effective hydrogen generation by harvesting solar energy with inexpensive photovoltaic materials, and fuel cells based on nanostructure catalysts.44
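Returning to the thermoelectric figure of merit discussed earlier in this section, its commonly used dimensionless form is ZT = S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, κ the thermal conductivity, and T the absolute temperature. This expression is the standard definition rather than one given in the text. The Python sketch below shows why a reduced thermal conductivity is beneficial; the property values are illustrative assumptions of the order reported for bismuth telluride alloys near room temperature, not data from the text.

```python
def figure_of_merit(seebeck, sigma, kappa, temperature):
    """Dimensionless thermoelectric figure of merit ZT = S^2 * sigma * T / kappa."""
    return seebeck**2 * sigma * temperature / kappa

if __name__ == "__main__":
    # Assumed illustrative properties (SI units).
    S = 200e-6      # Seebeck coefficient, V/K
    sigma = 1.0e5   # electrical conductivity, S/m
    T = 300.0       # absolute temperature, K
    for kappa in (1.5, 0.75):   # thermal conductivity, W/(m K)
        zt = figure_of_merit(S, sigma, kappa, T)
        print(f"kappa = {kappa:.2f} W/(m K): ZT = {zt:.2f}")
```

Because ZT is inversely proportional to κ, halving the thermal conductivity at fixed S and σ doubles the figure of merit, which is the motivation for using multilayer structures as heat barriers.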

1.3.6 Biomolecule Imaging and Molecular Electronics

Optical microscopy has played an instrumental role in medical diagnoses because it allows us to see bacteria and blood cells. Optical wavelengths are more desirable than x-rays or electron beams because they are less invasive and more convenient. However, the resolution of a traditional microscope is on the order of half the wavelength due to the diffraction limit. While the concept of near-field imaging existed in the literature before 1930, it was largely forgotten because of the inability to build the structures and control their motion. With microfabrication and precision-positioning capabilities, near-field scanning optical microscopes (NSOMs, also called SNOMs) were realized in the early 1980s by different groups and have been extensively used for biomolecule imaging with a resolution of 20 to 50 nm.45 The principle is to bring the light through a tapered fiber with an aperture of very small diameter at its end, or through an aperture of very small diameter. The beam emerging from the fiber tip or aperture will diverge quickly if the sample is placed in the far field, i.e., away from the aperture. However, high resolution can be achieved by placing the sample in close proximity to the aperture, within a distance much less than the wavelength, i.e., in the near field, such that the beam size is almost the same as the aperture. An apertureless metallic tip can be integrated with an SPM to guide the electromagnetic wave via surface plasmon resonance with a spatial resolution as high as 10 nm, for high-resolution imaging and processing. There have since been extensive studies on near-field interactions between electromagnetic waves and nanostructured materials, from semiconductor QDs, metallic nanoaperture and nanohole arrays, to DNA and RNA structures.
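The half-wavelength resolution of a conventional microscope can be compared with the near-field figures quoted above using the Abbe estimate d = λ/(2 NA), where NA is the numerical aperture of the objective. The Python sketch below makes this comparison; the 500-nm wavelength, the numerical aperture of 1, and the function name are assumptions chosen for illustration.

```python
def abbe_limit_nm(wavelength_nm, numerical_aperture=1.0):
    """Far-field resolution estimate d = wavelength / (2 * NA), in nanometers."""
    return wavelength_nm / (2.0 * numerical_aperture)

if __name__ == "__main__":
    wavelength = 500.0                    # assumed visible wavelength, nm
    far_field = abbe_limit_nm(wavelength)
    print(f"far-field limit at {wavelength:.0f} nm: about {far_field:.0f} nm")
    for near_field in (50.0, 20.0):       # NSOM resolution range quoted in the text
        gain = far_field / near_field
        print(f"near-field resolution of {near_field:.0f} nm: {gain:.1f}x finer")
```

Under these assumptions, the near-field resolution of 20 to 50 nm is roughly 5 to 12 times finer than the far-field diffraction limit for visible light.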
Nanoparticles are among the earliest known nanostructures; they have been used for centuries in making stained glass with gold or other metallic nanoparticles, as well as in photographic films with silver nanoparticles. A QD has a spherical core encapsulated in a shell made of another semiconductor material, such as a CdSe core in a ZnS shell. The outer shell is only several monolayers thick, and the diameters of QDs range from 2 to 10 nm. The material for the inner core has a smaller bandgap. Quantum confinement in the core results in size-dependent fluorescent properties. Compared with molecular dyes conventionally used for fluorescent labeling in cellular imaging, the emission from QD fluorophores is brighter with a narrower spectral width. QDs also allow excitation at shorter wavelengths, making it easier to separate the fluorescent signal from the scattered one, and are resistant to photobleaching, which causes dyes to lose fluorescence. Furthermore, the emission wavelength can be selected by varying the core size of QDs to provide multicolor labeling. It was first demonstrated in 1998 that QDs could be conjugated to biomolecules such as antibodies, peptides, and DNAs, enabling surface passivation and water solubility. In recent years, significant development has been made to employ QDs for in vivo and in vitro imaging, labeling, and sensing.46,47

CMOS technology is a top-down semiconductor fabrication process, in which patterns are created by first making a mask and then printing the desired features onto the surface of the wafer via lithography. Integrated circuits have dominated the technological and economic progress in the past 30 years, and complex and high-density devices have been manufactured on silicon wafers. However, this technology is going to reach a limit in 10 to 15 years, when the smallest feature size is less than 10 nm. Molecular electronics is considered

18

CHAPTER 1

as a promising alternative.48 A 3-D assembly with short interconnect distances would greatly increase the information storage density and transfer speed with reduced power consumption and amount of heat dissipated. Self-assembly means naturally occurring processes, from biological growth to the galaxy formation. In materials synthesis, selfassembly implies that the end products or structures are formed under favorable conditions and environments. An example is the growth of bulk crystals from a seed. Fullerenes and nanotubes are formed by self-assembling, not by slicing a graphite piece and then rolling and bending it to the shape of a tube or a shell. Self-assembly is referred to as a bottom-up process, like constructing an airplane model with LEGO pieces. Biological systems rely on self-assembly and self-replication to develop. Since 2000, CNT-based transistors have been built by several groups and found to be able to outperform Si-based ones. Transistors have also been created using a single molecule of a transition-metal organic complex nanobridge between two electrodes.49 Because of the small dimensions, quantum mechanics should govern the electrical and mechanical behaviors. Figure 1.9 illustrates an engineered

FIGURE 1.9 An engineered DNA strand between metal-atom contacts that could function as a molecular electronics device. (Courtesy of NASA Ames Center of Nanotechnology, http://ipt.arc.nasa.gov.)

DNA strand between metallic atoms, noting that the width of a DNA strand is around 2 nm. Such a structure could function as a sensor and other electronic components. Molecular electronics, while at its infancy, is expected to revolutionize electronics industry and to enable continuous technological progress through the twenty-first century.

Nano/microscale research and discoveries have been instrumental to the development of technologies used today in microelectronics, photonics, communication, manufacturing, and biomedicine. However, systematic and large-scale government investment in nanoscience and engineering did not start until the late 1990s, when the Interagency Working Group on Nanoscience, Engineering, and Technology (IWGN) was formed under the National Science and Technology Council (NSTC). The first report, entitled “Nanostructure Science and Technology,” was released in fall 1999, followed by the report “Nanotechnology Research Directions.” In July 2000, NSTC published the “National Nanotechnology Initiative (NNI).” A large number of nanotechnology centers and nanofabrication facilities have been established since then; see www.nano.gov. In the United States, government spending on nanotechnology R&D exceeded $1 billion in 2005, as compared to $464 million in 2001 and approximately $116 million in 1997. The total government investment worldwide was over $4 billion in 2005, and Japan and European countries invested amounts similar to that of the United States. Recognizing the increasing impact on engineering and science, the American Society of Mechanical Engineers established the ASME Nanotechnology Institute in mid-2001 and has sponsored a large number of international conferences and workshops; see http://nano.asme.org/. As mentioned earlier, understanding thermal transport and properties at the nanoscale is extremely important.
Engineers have the responsibility to translate basic science findings into technological advances, to design and develop better materials with desired functions, to build systems that integrate from small to large scales, to perform realistic modeling and simulation that facilitate practical realization of improved performance and continuously reduced cost, and to conduct quantitative measurements and tests that determine material properties and system performance. Like any other technology, nanotechnology may also have some adverse effects, such as toxic products and biochemical hazards, which are harmful to human health and the environment. There are also issues and debates concerning security, ethics, and religion. Government and industry standards organizations, as well as universities, have paid great attention to the societal implications and education issues in recent years. Optimists believe that we can harness nanobiotechnology to improve the quality of human life and benefit social progress, while overcoming the adverse effects, as we have done with electricity, chemical plants, and space technology.

1.4 OBJECTIVES AND ORGANIZATION OF THIS BOOK

Scientists, engineers, entrepreneurs, and lawmakers must work together for the research outcomes to be transferred into practical products that will advance the technology and benefit society. Nanotechnology is still in the early stage and holds tremendous potential; therefore, it is important to educate a large number of engineers with a solid background in nanoscale analysis and design so that they will become tomorrow’s leaders and inventors. There is a growing demand for educating mechanical engineering students at both the graduate and undergraduate levels with a background in thermal transport at micro/nanoscales. Micro/nanoscale heat transfer courses have been introduced in a number of universities; however, most of these courses are limited to the graduate level. While an edited book on Microscale Energy Transport has been available since 1998,3 it is difficult to use as a textbook due to the lack of examples, homework problems, and sufficient details on each subject. Some universities have introduced nanotechnology-related courses for freshmen and sophomores, with no in-depth coverage of the fundamentals of physics. A large number of institutions have introduced joint mechanical-electrical engineering courses on MEMS/NEMS, with a focus on device-level manufacturing and processing technology. To understand the thermal transport phenomena and thermophysical properties at small length scales, the concepts of quantum mechanics, solid state physics, and electrodynamics are indispensable. These concepts, however, are difficult for engineering students to comprehend. The aim of this book is to introduce the much needed physics knowledge without overwhelming mathematical operators or notions that are unfamiliar to engineering students. Therefore, this book can be used as the textbook not only in a graduate-level course but also in a technical elective for senior engineering undergraduates. While the book contains numerous equations, the math requirement mostly does not exceed engineering calculus including series, differential and integral equations, and some vector and matrix algebra. The reason for including such a large number of equations is to provide the necessary derivation steps, so that readers can follow and understand clearly. This is particularly helpful for practicing engineers who do not have a large number of references at hand. The emphasis of this book is placed on the fundamental understanding of the phenomena and properties: that is, why do we need particular equations and how can we apply them to solve thermal transport problems at the prescribed length and time scales? Selected and refined examples are provided that are both practical and illustrative. At the end of each of the remaining nine chapters, a large number of exercises are given at various levels of complexity and difficulty. Numerical methods are not presented in this book. Most of the problems can be solved with a personal computer using a typical software program or spreadsheet. For course instructors, the solutions of many homework problems can be obtained from the author.

The field of micro/nanoscale heat transfer was cultivated and fostered by Professor Chang-Lin Tien beginning in the late 1980s, along with the rapid development in microelectronics, MEMS, and nanotechnology.
His long-lasting and legendary contributions to thermal science research have been summarized in a recent volume of Annual Review of Heat Transfer.50 As early as the 1960s, Professor Tien investigated the fundamentals of the radiative properties of gas molecules, the size effect on the thermal conductivity of thin films and wires, and radiation tunneling between closely spaced surfaces. He published (with John H. Lienhard) a book in 1971, titled Statistical Thermodynamics, which provided inspiring discussions on early quantum mechanics and models of thermal properties of gases, liquids, and crystalline solids. While thermodynamics is a required course for mechanical engineering students, the principles of thermodynamics cannot be understood without a detailed background in statistical thermodynamics. Statistical mechanics and kinetic theory are also critical for understanding thermal properties and transport phenomena.

Chapter 2 provides an overview of equilibrium thermodynamics, heat transfer, and fluid mechanics. Building on the undergraduate mechanical engineering curriculum, the material is introduced in a quite different sequence to emphasize thermal equilibrium, the second law of thermodynamics, and thermodynamic relations. The concept of entropy is rigorously defined and applied to analyze conduction and convection heat transfer problems in this chapter. It should be noted that, in Chap. 8, an extensive discussion is given on the entropy of radiation. Chapter 3 introduces statistical mechanics and derives the classical (Maxwell-Boltzmann) statistics and quantum (i.e., Bose-Einstein and Fermi-Dirac) statistics. The first, second, and third laws of thermodynamics are presented with a microscopic interpretation, leading to the discussion of the Bose-Einstein condensate and laser cooling of atoms. The classical statistics are extensively used to obtain the ideal gas equation, the velocity distribution, and the specific heat.
A concise presentation of elementary quantum mechanics is then provided. This will help students gain a deep understanding of the earlier parts of this chapter, for example, the quantization of energy levels and the energy storage mechanisms by translation, rotation, and vibration used in modeling the specific heat of ideal polyatomic gases. The combined knowledge of quantum mechanics and statistical thermodynamics is important for subsequent studies. The concept of the photon as an elementary particle and how it interacts with an atom are discussed according to Einstein’s 1917 paper on the atomic absorption and emission mechanisms. Finally, the special theory of relativity is briefly introduced to help understand the limitation of mass conservation and the generality of the law of energy conservation.

Chapter 4 begins with a very basic kinetic theory of dilute gases and provides a microscopic understanding of pressure and shear. With the help of the mean free path and average collision distance, the transport coefficients such as viscosity, thermal conductivity, and mass diffusion coefficient are described. Following a discussion of intermolecular forces, the detailed Boltzmann transport equation (BTE) is presented to fully describe hydrodynamic equations as well as Fourier’s law of heat conduction, under appropriate approximations. In the next section, the regimes of microflow are described based on the Knudsen number, and the current methods to deal with microfluidics are summarized. The heat transfer associated with slip flow and temperature jump is presented in more detail with a simple planar geometry. Then, gas conduction between two surfaces under free molecular flow is derived. These examples, while simple, capture some of the basics of microfluidics. No further discussion is given on properties of liquids or multiphase fluids. It should be noted that several books on microflow already exist in the literature.

The next three chapters provide a comprehensive treatment of nano/microscale heat transfer in solids, with an emphasis on the physical phenomena as well as material properties. The materials covered in Chap. 5 are based on the simple free-electron model, kinetic theory, and BTE without a detailed background of solid state physics, which is discussed afterward in Chap. 6. This not only helps students comprehend the basic, underlying physical mechanisms but also allows the instructor to integrate Chap. 5 into a graduate heat conduction course. For an undergraduate elective, Chap. 6 can be considered as reading material or reference without spending too much time going through the details in class. In Chap. 5, the theory of specific heat is presented with a detailed treatment of the quantum size effect. Similarly, the theory of thermal conductivity of metals and dielectric solids is introduced. Because of the direct relation between electrical and thermal conductivities and the importance of thermoelectric effects, irreversible thermodynamics and thermoelectricity are also introduced. The classical size effect on thermal conductivity due to boundary scattering is elaborated. Finally, the concept of quantum conductance (both electric and thermal) is introduced. Chapter 6 introduces the electronic band structures and phonon dispersion relations in solids. It helps readers understand semiconductor physics and some of the difficulties of the free-electron model for metals. Photoemission, thermionic emission, and electron tunneling phenomena are introduced. The electrical transport in semiconductors is described with applications in energy conversion and optoelectronic devices. Chapter 7 focuses on nonequilibrium energy transport in nanostructures, including non-Fourier equations for transient heat conduction. The equation of phonon radiative transfer is presented and solved for thin-film and multilayer structures. The phenomenon of thermal boundary resistance is studied microscopically. A regime map is developed in terms of the length scale and the time scale from macroscale to microscale to nanoscale heat conduction. Additional reading materials regarding multiscale modeling, atomistic modeling, and thermal metrology are provided as references.

The last three chapters give a comprehensive discussion of nano/microscale radiation with extensive background on the fundamentals of electromagnetic waves, the optical and thermal radiative properties of materials and surfaces, and the recent advancement in nanophotonics and nanoscale radiative transfer. Chapter 8 presents the Maxwell equations of electromagnetic waves and the derivation of Planck’s law and radiation entropy. The electric and magnetic properties of the newly developed class of materials, i.e., negative-refractive-index materials, are also discussed. More extensive discussion of the radiative properties of thin films, gratings, and rough surfaces is given in Chap. 9. The wave interference, partial coherence, and diffraction phenomena are introduced with detailed formulations. In Chap. 10, attention is given to the evanescent wave, coupling and localization, surface plasmon polaritons, surface phonon polaritons, and near-field energy transfer. This chapter contains the most recent developments in near-field optics, nanophotonics, and nanoscale radiative transfer. These advancements will continue to have an impact on energy conversion devices, sensors, and nanoscale photothermal manufacturing.

It is noteworthy that the book Nanoscale Energy Transport and Conversion, by Professor G. Chen, has recently been published.4 In his book, a parallel treatment is presented to deal with electron, molecule, phonon, and photon transport processes. Such a parallel treatment places emphasis on the similarity and analogy between different energy carriers and transport mechanisms. While the approaches are unique and interesting, the book is difficult to use as a textbook at the entry level without some preliminary solid state physics and statistical thermodynamics background. The present book places materials within the context of each topic by presenting statistical thermodynamics, kinetic theory of ideal gases and microfluidics, electrons and phonons in solids, and electromagnetic waves and their interactions with nanomaterials in separate chapters. In addition to the differences in the organization and presentation, the coverage of the present text differs to some extent from Chen’s book. The present book contains much more extensive discussion on statistical thermodynamics and nanoscale thermal radiation, while Chen’s book includes additional chapters on liquids and their interfaces as well as molecular dynamics simulation. As a result, the two books complement each other in terms of the coverage and organization. It is hoped that the present text can be used either as a whole in a one-semester course, or in part for integration into an existing thermal science course for several weeks on a particular topic. Examples are graduate-level thermodynamics (Chaps. 2 and 3), convection heat transfer (Chap. 4), conduction heat transfer (Chaps. 5 and 7), and radiation heat transfer (Chaps. 8 and 9). Selected materials may also be used to introduce nanoscale thermal sciences in undergraduate heat transfer and fluid mechanics courses. Some universities offer a second course on thermodynamics at the undergraduate level, in which statistical thermodynamics and quantum theory can also be introduced. This text is also suitable for self-study by researchers or practicing engineers who graduated from a traditional engineering discipline. A large effort has been made to balance depth with breadth so that the text is easy to understand and contains sufficient coverage of both the fundamentals and advanced developments in the field. Readers will gain the background necessary to understand the contemporary research in nano/microscale thermal engineering and to solve a variety of practical problems using the approaches presented in the text.

REFERENCES

1. C. P. Poole, Jr. and F. J. Owens, Introduction to Nanotechnology, Wiley, New York, 2003.
2. E. L. Wolf, Nanophysics and Nanotechnology: An Introduction to Modern Concepts in Nanoscience, Wiley-VCH, Weinheim, Germany, 2004.
3. C. L. Tien, A. Majumdar, and F. M. Gerner (eds.), Microscale Energy Transport, Taylor & Francis, Washington, DC, 1998.
4. G. Chen, Nanoscale Energy Transport and Conversion, Oxford University Press, New York, 2005.
5. C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J., 27, 379–423, 623–656, July & October 1948. See http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html.
6. G. E. Moore, “Cramming more components onto integrated circuits,” Electronics, 38(8), 114–117, April 1965; G. E. Moore, “Progress in digital integrated electronics,” IEEE Tech. Digest (International Electron Devices Meeting), 11–13, 1975. www.intel.com/technology/mooreslaw.
7. B. K. Tsai, “A summary of lightpipe radiation thermometry research at NIST,” J. Res. Natl. Inst. Stand. Technol., 111, 9–30, 2006.
8. SIA, International Technology Roadmap for Semiconductors. www.itrs.net.
9. C. H. Fan and J. P. Longtin, “Radiative energy transport at the spatial and temporal micro/nano scales,” in Heat Transfer and Fluid Flow in Microscale and Nanoscale Structures, M. Faghri and B. Sunden (eds.), WIT Press, Southampton, UK, pp. 225–275, 2003.
10. W. Denk, J. H. Stricker, and W. W. Webb, “Two-photon laser scanning fluorescence microscopy,” Science, 248, 73–76, 1990.
11. T. Yu, C. K. Ober, S. M. Kuebler, W. Zhou, S. R. Marder, and J. W. Perry, “Chemically-amplified positive resist system for two-photon three-dimensional lithography,” Adv. Mater., 15, 517–521, 2003; S. M. Kuebler, K. L. Braun, W. Zhou, et al., “Design and application of high-sensitivity two-photon initiators for three-dimensional microfabrication,” J. Photochem. Photobio. A: Chemistry, 158, 163–170, 2003.
12. M. F. Modest and H. Abakians, “Heat-conduction in a moving semi-infinite solid subject to pulsed laser irradiation,” J. Heat Transfer, 108, 597–601, 1986; M. F. Modest and H. Abakians, “Evaporative cutting of a semi-infinite body with a moving cw laser,” J. Heat Transfer, 108, 602–607, 1986.
13. C. L. Tien, T. Q. Qiu, and P. M. Norris, “Microscale thermal phenomena in contemporary technology,” Thermal Sci. Eng., 2, 1–11, 1994.
14. R. J. Stoner and H. J. Maris, “Kapitza conductance and heat flow between solids at temperatures from 50 to 300 K,” Phys. Rev. B, 48, 16373–16387, 1993.
15. W. S. Capinski, H. J. Maris, T. Ruf, M. Cardona, K. Ploog, and D. S. Katzer, “Thermal-conductivity measurements of GaAs/AlAs superlattices using a picosecond optical pump-and-probe technique,” Phys. Rev. B, 59, 8105–8113, 1999.
16. P. M. Norris, A. P. Caffrey, R. Stevens, J. M. Klopf, J. T. McLeskey, and A. N. Smith, “Femtosecond pump-probe nondestructive evaluation of materials,” Rev. Sci. Instrum., 74, 400–406, 2003; R. J. Stevens, A. N. Smith, and P. M. Norris, “Measurement of thermal boundary conductance of a series of metal-dielectric interfaces by the transient thermoreflectance techniques,” J. Heat Transfer, 127, 315–322, 2005.
17. O. Manasreh, Semiconductor Heterojunctions and Nanostructures, McGraw-Hill, New York, 2005.
18. G. Chen, “Heat transfer in micro- and nanoscale photonic devices,” Annu. Rev. Heat Transfer, 7, 1–18, 1996.
19. Y. Jaluria, “Thermal processing of materials: from basic research to engineering,” J. Heat Transfer, 125, 957–979, 2003; X. Cheng and Y. Jaluria, “Optimization of a thermal manufacturing process: drawing of optical fiber,” Int. J. Heat Mass Transfer, 48, 3560–3573, 2005; C. Chen and Y. Jaluria, “Modeling of radiation heat transfer in the drawing of an optical fiber with multilayer structure,” J. Heat Transfer, 129, 342–352, 2007.
20. P. N. Prasad, Nanophotonics, Wiley, New York, 2004.
21. Z. M. Zhang and M. P. Mengüç, “Guest editorial: special issue on nano/microscale radiative transfer,” J. Heat Transfer, 129, 1–2, 2007.
22. R. P. Feynman, “There’s plenty of room at the bottom,” J. Microelectromechanical Systems, 1, 60–66, 1992; R. P. Feynman, “Infinitesimal Machinery,” J. Microelectromechanical Systems, 2, 4–14, 1993. www.zyvex.com/nanotech/feynman.html.
23. M. J. Madou, Fundamentals of Microfabrication: The Science of Miniaturization, 2nd ed., CRC Press, Boca Raton, FL, 2002.
24. E. O. Sunden, T. L. Wright, J. Lee, W. P. King, and S. Graham, “Room-temperature chemical vapor deposition and mass detection on a heated atomic force microscope cantilever,” Appl. Phys. Lett., 88, 033107, 2006.
25. K. Hirahara, K. Suenaga, S. Bandow, et al., “One-dimensional metallofullerene crystal generated inside single-walled carbon nanotubes,” Phys. Rev. Lett., 85, 5384, 2000. Also see Phys. Rev. Focus, 19 December 2000 at http://focus.aps.org/story/v6/st27.
26. P. X. Gao, Y. Ding, W. J. Mai, W. L. Hughes, C. S. Lao, and Z. L. Wang, “Conversion of zinc oxide nanobelt into superlattice-structured nanohelices,” Science, 309, 1700–1704, 2005; X. Y. Kong, Y. Ding, R. Yang, and Z. L. Wang, “Single-crystal nanorings formed by epitaxial self-coiling of polar nanobelts,” Science, 309, 1348–1351, 2004.
27. Y. Yang, W. Liu, and M. Asheghi, “Thermal and electrical characterization of Cu/CoFe superlattices,” Appl. Phys. Lett., 84, 3121–3123, 2004; Y. Yang, R. M. White, and M. Asheghi, “Thermal characterization of Cu/CoFe multilayer for giant magnetoresistive (GMR) head applications,” J. Heat Transfer, 128, 113–120, 2006.
28. G. Binnig and H. Rohrer, “Scanning tunneling microscopy,” Helv. Phys. Acta, 55, 726–735, 1982; G. Binnig, H. Rohrer, Ch. Gerber, and E. Weibel, “Surface studies by scanning tunneling microscopy,” Phys. Rev. Lett., 49, 57–61, 1982; G. Binnig, H. Rohrer, Ch. Gerber, and E. Weibel, “7 × 7 reconstruction on Si(111) resolved in real space,” Phys. Rev. Lett., 50, 120–123, 1983.
29. M. F. Crommie, C. P. Lutz, and D. M. Eigler, “Confinement of electrons to quantum corrals on a metal surface,” Science, 262, 218–220, 1993.
30. G. Binnig, C. F. Quate, and Ch. Gerber, “Atomic force microscope,” Phys. Rev. Lett., 56, 930–933, 1986.
31. C. C. Williams and H. K. Wickramasinghe, “Scanning thermal profiler,” Appl. Phys. Lett., 49, 1587–1589, 1986; J. M. R. Weaver, L. M. Walpita, and H. K. Wickramasinghe, “Optical absorption microscopy with nanometer resolution,” Nature, 342, 783–785, 1989; M. Nonnenmacher and H. K. Wickramasinghe, “Optical absorption spectroscopy by scanning force microscopy,” Ultramicroscopy, 42–44, 351–354, 1992.
32. A. Majumdar, “Scanning thermal microscopy,” Annu. Rev. Mater. Sci., 29, 505–585, 1999.
33. H.-K. Lyeo, A. A. Khajetoorians, L. Shi, et al., “Profiling the thermoelectric power of semiconductor junctions with nanometer resolution,” Science, 303, 818–820, 2004; Z. Bian, A. Shakouri, L. Shi, H.-K. Lyeo, and C. K. Shih, “Three-dimensional modeling of nanoscale Seebeck measurement by scanning thermoelectric microscopy,” Appl. Phys. Lett., 87, 053115, 2005.
34. H. J. Mamin and D. Rugar, “Thermomechanical writing with an atomic force microscope tip,” Appl. Phys. Lett., 61, 1003–1005, 1992; H. J. Mamin, “Thermal writing using a heated atomic force microscope tip,” Appl. Phys. Lett., 69, 433–435, 1996.
35. G. Binnig, M. Despont, U. Drechsler, et al., “Ultrahigh-density atomic force microscopy data storage with erase capability,” Appl. Phys. Lett., 74, 1329–1331, 1999; W. P. King, T. W. Kenny, K. E. Goodson, et al., “Atomic force microscope cantilevers for combined thermomechanical data writing and reading,” Appl. Phys. Lett., 78, 1300–1302, 2001.
36. U. Dürig, G. Cross, M. Despont, et al., “‘Millipede’: an AFM data storage system at the frontier of nanotechnology,” Tribology Lett., 9, 25–32, 2000; P. Vettiger, G. Cross, M. Despont, et al., “The ‘millipede’: nanotechnology entering data storage,” IEEE Trans. Nanotechnol., 1, 39–55, 2002.
37. P. E. Sheehan, L. J. Whitman, W. P. King, and B. A. Nelson, “Nanoscale deposition of solid inks via thermal dip pen nanolithography,” Appl. Phys. Lett., 85, 1589–1591, 2004.
38. J. A. Eastman, S. R. Phillpot, S. U. S. Choi, and P. Kablinski, “Thermal transport in nanofluids,” Annu. Rev. Mater. Res., 34, 219–246, 2004.
39. R. S. Prasher, P. Bhattacharya, and P. E. Phelan, “Thermal conductivity of nanoscale colloidal solutions (nanofluids),” Phys. Rev. Lett., 94, 025901, 2005; R. Prasher, P. Bhattacharya, and P. E. Phelan, “Brownian-motion-based convective-conductive model for the effective thermal conductivity of nanofluids,” J. Heat Transfer, 128, 588–595, 2006.
40. G. Chen and A. Shakouri, “Heat transfer in nanostructures for solid-state energy conversion,” J. Heat Transfer, 124, 242–252, 2002; H. Böttner, G. Chen, and R. Venkatasubramanian, “Aspects of thin-film superlattice thermoelectric materials, devices and applications,” MRS Bulletin, 31, 211–217, March 2006.
41. S. Basu, Y.-B. Chen, and Z. M. Zhang, “Microscale radiation in thermophotovoltaic devices: A review,” Int. J. Ener. Res., 31, in press, 2007. (Published online 6 Dec. 2006.)
42. M. Law, L. E. Greene, J. C. Johnson, R. Saykally, and P. Yang, “Nanowire dye-sensitized solar cells,” Nature Mater., 4, 455–459, 2005.
43. A. Mihi and H. Miguez, “Origin of light-harvesting enhancement in colloidal-photonic-crystal-based dye-sensitized solar cells,” J. Phys. Chem. B, 109, 15968–15976, 2005.
44. G. Crabtree, M. Dresselhaus, and M. Buchanan, “The hydrogen economy,” Physics Today, 39–44, December 2004.
45. A. Lewis, H. Taha, A. Strinkovski, et al., “Near-field optics: from subwavelength illumination to nanometric shadowing,” Nature Biotechnol., 21, 1378–1386, 2003.
46. X. Michalet, F. F. Pinaud, L. A. Bentolila, et al., “Quantum dots for live cells, in vivo imaging, and diagnostics,” Science, 307, 538–544, 2005.
47. I. L. Medintz, H. T. Uyeda, E. R. Goldman, and H. Mattoussi, “Quantum dot bioconjugates for imaging, labelling and sensing,” Nature Mater., 4, 435–446, 2005.
48. B. Yu and M. Meyyappan, “Nanotechnology: role in emerging nanoelectronics,” Solid-State Electronics, 50, 536–544, 2006.
49. S. De Franceschi and L. Kouwenhoven, “Electronics and the single atom,” Nature, 417, 701–702, 2002.
50. V. Prasad, Y. Jaluria, and G. Chen (eds.), Annual Review of Heat Transfer, Vol. 14, Begell House, New York, 2005.

CHAPTER 2

OVERVIEW OF MACROSCOPIC THERMAL SCIENCES

This chapter provides a concise description of the basic concepts and theories underlying classical thermodynamics and heat transfer. Different approaches exist in presenting the subject of thermodynamics. Most engineering textbooks first introduce temperature, then discuss energy, work, and heat, and define entropy afterward. Callen developed an axiomatic structure using a simple set of abstract postulates to combine the physical information that is included in the laws of thermodynamics.1 Continuing the effort pioneered by Keenan and Hatsopoulos,2 Gyftopoulos and Beretta3 developed a logical sequence to introduce the basic concepts with a rigorous definition of each thermodynamic term. Their book has been a great inspiration to the present author in comprehending and teaching thermodynamics. Here, an overview of classical thermodynamics is provided that goes somewhat beyond typical undergraduate textbooks.4,5 Details on the historical development of classical thermodynamics can be found in Bejan6 and Kestin7, and references therein. The basic phenomena and governing equations in energy, mass, and momentum transfer will be presented subsequently in a self-consistent manner without invoking microscopic theories.

2.1 FUNDAMENTALS OF THERMODYNAMICS

A system is a collection of constituents (whose amounts may be fixed or varied within a specified range) in a defined space (e.g., a container whose volume may be fixed or varied within a specified range), subject to other external forces (such as gravitational and magnetic forces) and constraints. External forces are characterized by parameters. An example is the volume of a container, which is a parameter associated with the forces that confine the constituents within a specified space. Everything that is not included in the system is called the environment or surroundings of the system. Quantities that characterize the behavior of a system at any instant of time are called properties of the system. Properties must be measurable and their values are independent of the measuring devices. Properties supplement constituents and parameters to fully characterize a system. At any given time, the system is said to be in a state, which is fully characterized by the types and amounts of constituents, a set of parameters associated with various types of external forces, and a set of properties. Two states are identical if the amount of each type of constituent and the values of all the parameters and properties are the same. A system may experience a spontaneous change of state, when the change of state does not involve any interaction between the system and its environment. If the system changes its state through interactions with other systems in the environment, it is said to experience an induced change of state. If a system can experience only spontaneous changes of state, it is said to be an isolated system, that is, the change of state of the system does not affect the environment of the system. The study of the possible and allowed states of a system is called kinematics, and the study of the time evolution of the state is called dynamics. The relation that describes the change of state of a system as a function of time is the equation of motion. In practice, the complete equations of motion are often not known. Therefore, in thermodynamics the description of the change of state is usually given in terms of the end states (i.e., the initial and final states) and the modes of interaction (for example, work and heat, which are discussed later). The end states and the modes of interaction specify a process. A spontaneous change of state is also called a spontaneous process. A process is reversible if there is at least one way to restore both the system and its environment to their initial states. Otherwise, the process is irreversible, i.e., it is not possible to restore both the system and its environment to their initial states. A steady state is one that does not change as a function of time despite interactions between the system and other systems in the environment.

2.1.1 The First Law of Thermodynamics

Energy is a property of every system in any state. The first law of thermodynamics states that energy can be transferred to or from a system but can be neither created nor destroyed. The energy balance for a system can be expressed as

$$\Delta E = E_2 - E_1 = E_{\mathrm{net,in}} \tag{2.1a}$$

where $\Delta$ denotes a finite change, subscripts 1 and 2 refer to the initial and final states, respectively, and $E_{\mathrm{net,in}} = E_{\mathrm{in}} - E_{\mathrm{out}}$ is the net amount of energy transferred into the system. For an infinitesimal change, the differential form of the energy balance is

$$dE = \delta E_{\mathrm{net,in}} \tag{2.1b}$$

Here, $d$ is used to signify a differential change of the property of a system, and $\delta$ is used to specify a differentially small quantity that is not a property of any system. Clearly, the energy of an isolated system is conserved. Energy is an additive property, i.e., the energy of a composite system is the sum of the energies of all individual subsystems. Examples are kinetic energy and potential energy, as defined in classical mechanics, and internal energy, which will be discussed later. A similar expression for mass balance can also be written.

The term mechanical effect is used for the kind of processes described in mechanics, such as the change of the height of a weight in a gravitational field, the change of the relative positions of two charged particles, the change of the velocity of a point mass, the change of the length of a spring, or a combination of such changes. All mechanical effects are equivalent in the sense that it is always possible to arrange forces and processes that annul all the mechanical effects except one that we choose. It is common to choose the rise and fall of a weight in a gravity field to represent this kind of process.

A cyclic process (also called a cycle) is one with identical initial and final states. A perpetual-motion machine of the first kind (PMM1) is any device (or system) undergoing a cyclic process that produces no external effects but the rise or fall of a weight in a gravity field. A PMM1 violates the first law of thermodynamics, and hence, it is impossible to build a PMM1. Perpetual motion, however, may exist as long as it produces zero net external effect. Examples of perpetual motion are a lossless oscillating pendulum, an electric current through a superconducting coil, and so forth.


2.1.2 Thermodynamic Equilibrium and the Second Law

An equilibrium state is a state that cannot change spontaneously with time. There are different types of equilibrium: unstable, stable, and metastable. A stable-equilibrium state is a state that cannot be altered to a different state without leaving any net effect on the environment. In the following, a stable-equilibrium state is frequently referred to as a state at thermodynamic equilibrium. The stable-equilibrium-state principle, or state principle, can be phrased as follows: Among all states of a system with a given set of values of energy, parameters, and constituents, there exists one and only one stable-equilibrium state. That is to say that, in a stable-equilibrium state, all properties are uniquely determined by the amount of energy, the value of each parameter, and the amount of each type of constituents. This principle is an integral part of the second law of thermodynamics.2,3,7 It is important for the thermodynamic definition of temperature and the derivation of thermodynamic relations in stable-equilibrium states. Another aspect of the second law of thermodynamics is the definition of an important property, called entropy, as discussed next.

Entropy is an additive property of every system in any state. The second law of thermodynamics asserts that, in an isolated system, entropy cannot be destroyed but can either be created (in an irreversible process) or remain the same (in a reversible process). The entropy produced as time evolves during an irreversible process is called the entropy generation ($S_{\mathrm{gen}}$) of the process due to irreversibility. Like energy, entropy can be transferred from one system to another. One can write the entropy balance as follows (keeping in mind that entropy generation must not be negative):

$$\Delta S = S_2 - S_1 = S_{\mathrm{net,in}} + S_{\mathrm{gen}} \quad \text{with} \quad S_{\mathrm{gen}} \ge 0 \tag{2.2a}$$

or

$$dS = \delta S_{\mathrm{net,in}} + \delta S_{\mathrm{gen}} \quad \text{with} \quad \delta S_{\mathrm{gen}} \ge 0 \tag{2.2b}$$

Here again, $\delta$ is used to indicate an infinitesimal quantity that is not a property of any system. For a system with fixed values of energy ($E$), parameters, and constituents, the entropy of the system is the largest in the stable-equilibrium state. This is the highest entropy principle. Applying this principle to an isolated system for which the energy is conserved, the entropy of the system will increase until a thermodynamic equilibrium is reached. Spontaneous changes of state are usually irreversible and accompanied by entropy generation.

The second law of thermodynamics can be summarized with the following three statements: (1) There exists a unique stable-equilibrium state for any system with given values of energy, parameters, and constituents. (2) Entropy is an additive property, and for an isolated system, the entropy change must be nonnegative. (3) Among all states with the same values of energy, parameters, and constituents, the entropy of the stable-equilibrium state is the maximum.

The energy of a system with volume ($V$) as its only parameter (neglecting other external forces) is called the internal energy ($U$). The state principle implies that there are $r + 2$ (where $r$ is the number of different constituents) independent variables that fully characterize a stable-equilibrium state of such a system. Therefore, in a stable-equilibrium state, all properties are functions of $r + 2$ independent variables. Since entropy is a property of the system, we have

$$S = S(U, V, N_1, N_2, \ldots, N_r) \tag{2.3}$$


where $N_i$ is the number of particles of the $i$th species (or type of constituents). This function is continuous and differentiable, and furthermore, it is a monotonically increasing function of energy for fixed values of $V$ and $N_j$'s.1,3,6 Equation (2.3) can be uniquely solved for $U$ so that

$$U = U(S, V, N_1, N_2, \ldots, N_r) \tag{2.4}$$

which is also continuous and admits partial derivatives of all orders. Each first-order partial derivative of Eq. (2.3) or (2.4) represents a property of the stable-equilibrium state. For example, temperature and pressure are properties of a system at thermodynamic equilibrium. The (absolute) temperature is defined by

$$T = \left(\frac{\partial U}{\partial S}\right)_{V,\,N_j\text{'s}} \tag{2.5a}$$

and the pressure is defined by

$$P = -\left(\frac{\partial U}{\partial V}\right)_{S,\,N_j\text{'s}} \tag{2.5b}$$

The partial derivative with respect to the $i$th type of constituents defines the chemical potential of that species,

$$\mu_i = \left(\frac{\partial U}{\partial N_i}\right)_{S,\,V,\,N_j\text{'s}\,(j \ne i)} \tag{2.5c}$$

Equation (2.3) or (2.4) is called the fundamental relation for states at thermodynamic equilibrium. The differential form of Eq. (2.4) is the Gibbs relation:

$$dU = T\,dS - P\,dV + \sum_{i=1}^{r} \mu_i\,dN_i \tag{2.6}$$

where Eq. (2.5) has been used. The above equation may be rearranged into the form

$$dS = \frac{1}{T}\,dU + \frac{P}{T}\,dV - \sum_{i=1}^{r} \frac{\mu_i}{T}\,dN_i \tag{2.7}$$

Therefore,

$$\left(\frac{\partial S}{\partial U}\right)_{V,\,N_j\text{'s}} = \frac{1}{T}, \quad \left(\frac{\partial S}{\partial V}\right)_{U,\,N_j\text{'s}} = \frac{P}{T}, \quad \text{and} \quad \left(\frac{\partial S}{\partial N_i}\right)_{U,\,V,\,N_j\text{'s}\,(j \ne i)} = -\frac{\mu_i}{T} \tag{2.8}$$
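As a minimal numerical sketch of how Eq. (2.8) extracts temperature and pressure from a fundamental relation, the snippet below differentiates an assumed entropy function by finite differences. The monatomic-ideal-gas form of $S(U, V, N)$ (with its additive constant dropped), the function names, and the numerical values are illustrative assumptions and are not taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def entropy(U, V, N):
    """Assumed fundamental relation S(U, V, N) of a monatomic ideal gas.

    The additive constant is omitted because it does not affect the
    derivatives with respect to U and V.
    """
    return N * K_B * (math.log(V / N) + 1.5 * math.log(U / N))


def central_difference(f, x, rel_step=1e-6):
    """Approximate df/dx with a symmetric finite difference."""
    h = abs(x) * rel_step
    return (f(x + h) - f(x - h)) / (2.0 * h)


def temperature(U, V, N):
    """Eq. (2.8): (dS/dU) at fixed V and N equals 1/T."""
    return 1.0 / central_difference(lambda u: entropy(u, V, N), U)


def pressure(U, V, N):
    """Eq. (2.8): (dS/dV) at fixed U and N equals P/T."""
    dS_dV = central_difference(lambda v: entropy(U, v, N), V)
    return temperature(U, V, N) * dS_dV


if __name__ == "__main__":
    N = 6.02214076e23          # one mole of atoms (assumed)
    V = 0.0248                 # m^3 (assumed)
    U = 1.5 * N * K_B * 300.0  # J, chosen so that T should come out near 300 K
    print(temperature(U, V, N), pressure(U, V, N))
```

For this assumed relation the two derivatives reduce to $U = \tfrac{3}{2} N k_B T$ and $PV = N k_B T$, so the printed values can be checked by hand.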

An interaction between two systems that results in a transfer of energy without net exchanges of entropy and constituents is called a work interaction. The amount of energy transferred in such an interaction is called work ($W$). An interaction that has only mechanical effects is a work interaction, but a work interaction may involve nonmechanical effects. A process that involves only work interaction is called an adiabatic process. Another kind of typical interaction is heat interaction, in which both energy and entropy are transferred without net exchanges of constituents and parameters between two systems. The amount of energy transferred in a heat interaction is called heat ($Q$). Furthermore, the amount of entropy transferred $\delta S$ is equal to the amount of energy transferred $\delta Q$ divided by the temperature $T_Q$ at which the heat interaction happens, i.e., $\delta S = \delta Q/T_Q$. If a system cannot exchange constituents with other systems, it is said to be a closed system; otherwise, it is an open system.

Reversible processes are considered as the limiting cases of real processes, which are always accompanied by a certain amount of irreversibility. Such an ideal process is called a quasi-equilibrium (or quasi-static) process, in which each stage can be made as close to thermodynamic equilibrium as possible if the movement is frictionless and very slow. In an ideal process, a finite amount of heat can be transferred reversibly from one system to another at a constant temperature. In practice, heat transfer can only happen when there is a temperature difference, and the process is always irreversible.

A perpetual-motion machine of the second kind (PMM2) is a cyclic device that interacts with a system at thermodynamic equilibrium and produces no external effect other than the rise of a weight in a gravity field, without changing the values of parameters and the amounts of constituents of the system. Historically, there exist different statements of the second law of thermodynamics: The Kelvin-Planck statement of the second law is that it is impossible to build a PMM2. The Clausius statement of the second law is that it is not possible to construct a cyclic machine that will produce no effect other than the transfer of heat from a system at lower temperature to a system at higher temperature. These statements can be proved using the three statements of the second law of thermodynamics given earlier in this chapter.

Example 2-1. Criteria for thermodynamic equilibrium. Consider a moveable piston (adiabatic and impermeable to matter) that separates a cylinder into two compartments (systems A and B), as shown in Fig. 2.1. We learned from mechanics that a mechanical equilibrium requires a balance of forces on both sides of the piston, that is to say the pressure of system A must be the same as that of system B (i.e., $P_A = P_B$). If the piston wall is made of materials that are diathermal (allowing heat transfer) and permeable to all species, under what conditions will the composite system C consisting of systems A and B be at stable equilibrium?

FIGURE 2.1 Illustration of two systems that may exchange work, heat, and species.

Solution. Assume system C is isolated from other systems, and each of the subsystems A and B is at a thermodynamic equilibrium state, whose properties are solely determined by its internal energy, volume, and amount of constituents: $U_A$, $V_A$, $N_{j,A}$'s and $U_B$, $V_B$, $N_{j,B}$'s, respectively. There exist neighboring states for both subsystems with small differences in $U$, $V$, and $N_j$'s, but the values of the composite system must be conserved, i.e., $dU_A = -dU_B$, $dV_A = -dV_B$, and $dN_{i,A} = -dN_{i,B}$ ($i = 1, 2, \ldots, r$). The differential entropy of system C can be expressed as:

$$
\begin{aligned}
dS_C &= dS_A + dS_B \\
&= \frac{1}{T_A}\,dU_A + \frac{P_A}{T_A}\,dV_A - \sum_{i=1}^{r} \frac{\mu_{i,A}}{T_A}\,dN_{i,A} + \frac{1}{T_B}\,dU_B + \frac{P_B}{T_B}\,dV_B - \sum_{i=1}^{r} \frac{\mu_{i,B}}{T_B}\,dN_{i,B} \\
&= \left(\frac{1}{T_A} - \frac{1}{T_B}\right) dU_A + \left(\frac{P_A}{T_A} - \frac{P_B}{T_B}\right) dV_A - \sum_{i=1}^{r} \left(\frac{\mu_{i,A}}{T_A} - \frac{\mu_{i,B}}{T_B}\right) dN_{i,A}
\end{aligned}
\tag{2.9}
$$


If system C is in a stable-equilibrium state, its entropy is maximum and $dS_C = 0$. Since the values of $dU_A$, $dV_A$, and $dN_{i,A}$ are arbitrary, we must have

$$\frac{1}{T_A} = \frac{1}{T_B}, \quad \frac{P_A}{T_A} = \frac{P_B}{T_B}, \quad \text{and} \quad \frac{\mu_{i,A}}{T_A} = \frac{\mu_{i,B}}{T_B} \quad (i = 1, 2, \ldots, r)$$

or

$$T_A = T_B, \quad P_A = P_B, \quad \text{and} \quad \mu_{i,A} = \mu_{i,B} \quad (i = 1, 2, \ldots, r) \tag{2.10}$$

These conditions correspond to thermal equilibrium, mechanical equilibrium, and chemical equilibrium, respectively. The combination forms the criteria for thermodynamic equilibrium.

Discussion. In the case when the piston is diathermal but rigid and impermeable to matter, the entropy change of system C must be nonnegative, i.e.,

$$dS_C = dS_A + dS_B = \left(\frac{1}{T_A} - \frac{1}{T_B}\right) dU_A \ge 0 \tag{2.11}$$

The above expression implies that $dU_A \ge 0$ for $T_A < T_B$, and $dU_A \le 0$ for $T_A > T_B$. Spontaneous heat transfer can occur only from regions of higher temperature to regions of lower temperature. This essentially proves the Clausius statement of the second law of thermodynamics.
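The sign argument behind Eq. (2.11) can be illustrated numerically. The sketch below passes energy in small packets between two blocks and accumulates $dS_C$ at each step. The constant heat capacities, the initial temperatures, and the function name are assumptions made for illustration only.

```python
import math


def relax_to_thermal_equilibrium(T_A, T_B, C_A, C_B, steps=10_000):
    """Exchange energy in small packets until both blocks share one temperature.

    Each block is assumed to have a constant heat capacity (C_A, C_B in J/K).
    Returns the final temperatures, the accumulated entropy generation, and
    the smallest single-step entropy change seen along the way.
    """
    T_eq = (C_A * T_A + C_B * T_B) / (C_A + C_B)  # from energy conservation
    dU_A = C_A * (T_eq - T_A) / steps             # energy received by A per step
    S_gen = 0.0
    min_dS = math.inf
    for _ in range(steps):
        dS_C = (1.0 / T_A - 1.0 / T_B) * dU_A     # Eq. (2.11)
        S_gen += dS_C
        min_dS = min(min_dS, dS_C)
        T_A += dU_A / C_A                         # dU_B = -dU_A
        T_B -= dU_A / C_B
    return T_A, T_B, S_gen, min_dS


if __name__ == "__main__":
    print(relax_to_thermal_equilibrium(300.0, 400.0, 1000.0, 1000.0))
```

Every step contributes a nonnegative $dS_C$ because the energy always flows toward the colder block, and the accumulated total approaches $C_A \ln(T_{\mathrm{eq}}/T_A) + C_B \ln(T_{\mathrm{eq}}/T_B)$ as the packets are made smaller.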

The concept of thermal equilibrium provides the physical foundation for thermometry, which is the science of temperature measurement. The temperature of a system at a thermodynamic equilibrium state is measured through changes in resistance, length, volume, or other physical parameters of the sensing element used in the thermometer, which is brought to thermal equilibrium with the system. Based on the inclusive statement of the second law of thermodynamics given previously, it can be inferred that two systems are in thermal equilibrium with each other if they are separately in thermal equilibrium with a third system. This is sometimes referred to as the zeroth law of thermodynamics.6 The International Temperature Scale of 1990 (ITS-90) was adopted by the International Committee of Weights and Measures in 1989.8 The unit of thermodynamic temperature is kelvin (K), which is defined as 1/273.16 of the thermodynamic temperature of the triple point of water. The Celsius temperature is defined as the difference of the thermodynamic temperature and 273.15 K (the ice point). A difference of temperature may be expressed in either kelvins or degrees Celsius (°C). Although earlier attempts were made to define a temperature scale consistent with the original Celsius temperature scale (i.e., 0°C for the ice point and 100°C for the steam point), a 0.026°C departure arose from more accurate measurements of the steam point, as shown in Table 2.1.9 The steam point is therefore no longer

TABLE 2.1 Two-Phase Points and the Triple Point of Water

                    Temperature (K)    Temperature (°C)
Ice point*          273.15             0
Triple point†       273.16             0.01
Steam point‡        373.124            99.974

* Solid and liquid phases are in equilibrium at a pressure of 1 atm (101.325 kPa).
† Solid, liquid, and vapor phases are in equilibrium.
‡ Liquid and vapor phases are in equilibrium at 1 atm.

used as a defining fixed point in the ITS-90. More accurate steam tables were developed in the 1990s. The ITS-90 defines 17 fixed points, which are determined by primary thermometry with standard uncertainties less than 0.002 K below 303 K and up to 0.05 K at the freezing point of copper (

$[1 - (1 - \varepsilon_1)(1 - \varepsilon_2)] = \sigma_{SB} T_1^4 / (1/\varepsilon_1 + 1/\varepsilon_2 - 1)$, and that from surface 2 to surface 1 is $q''_{2 \to 1} = \sigma_{SB} T_2^4 / (1/\varepsilon_1 + 1/\varepsilon_2 - 1)$. Subsequently, the net radiative flux from surface 1 to surface 2 is

$$q''_{12} = q''_{1 \to 2} - q''_{2 \to 1} = \frac{\sigma_{SB}\,(T_1^4 - T_2^4)}{1/\varepsilon_1 + 1/\varepsilon_2 - 1} \tag{2.51}$$

Plugging in $T_1 = 873$ K, $T_2 = 300$ K, and other numerical values, we obtain $q''_{12} = 14{,}433$ W/m².
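Equation (2.51) is straightforward to evaluate in code. The sketch below is illustrative only: the emissivities of the two surfaces are not stated in this passage, so the values 0.8 and 0.5 used here are assumptions, picked because they happen to be consistent with the quoted result. The function name and the rounded Stefan-Boltzmann constant are likewise assumptions.

```python
SIGMA_SB = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4), rounded


def net_flux_parallel_plates(T1, T2, eps1, eps2):
    """Net radiative flux (W/m^2) from surface 1 to surface 2, Eq. (2.51).

    Both surfaces are treated as large, parallel, diffuse-gray plates with
    emissivities eps1 and eps2 at temperatures T1 and T2 (in kelvin).
    """
    return SIGMA_SB * (T1**4 - T2**4) / (1.0 / eps1 + 1.0 / eps2 - 1.0)


if __name__ == "__main__":
    # eps1 = 0.8 and eps2 = 0.5 are assumed values, not taken from the text.
    print(round(net_flux_parallel_plates(873.0, 300.0, 0.8, 0.5)))
```

Setting both emissivities to 1 recovers the blackbody limit $\sigma_{SB}(T_1^4 - T_2^4)$, which is a quick sanity check on the denominator.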

Gas emission, absorption, and scattering are important for atmospheric radiation and combustion. When radiation travels through a cloud of gas, some of the energy may be absorbed. The absorption of photons raises the energy levels of individual molecules. At sufficiently high temperatures, gas molecules may spontaneously lower their energy levels and emit photons. These changes in energy levels are called radiative transitions, which include bound-bound transitions (between nondissociated molecular states), bound-free transitions (between nondissociated and dissociated states), and free-free transitions (between dissociated states). Bound-free and free-free transitions usually occur at very high temperatures (greater than about 5000 K) and emit in the ultraviolet and visible regions. The most important transitions for radiative heat transfer are bound-bound transitions between vibrational energy levels coupled with rotational transitions. The photon energy (or frequency) must be exactly the same as the difference between two energy levels in order for the photon to be absorbed or emitted; therefore, the quantization of the energy levels results in discrete spectral lines for absorption and emission. The rotational lines superimposed on a vibrational line give a band of closely spaced spectral lines, called the vibration-rotation spectrum. Additional discussion will be given in Chap. 3 about quantized transitions in atoms and molecules. Particles can also scatter electromagnetic waves or photons, causing a change in the direction of propagation. In the early twentieth century, Gustav Mie developed a solution of Maxwell’s equations for scattering of electromagnetic waves by spherical particles, known as the Mie scattering theory which can be used to predict the scattering phase function. 
In the case when the particle sizes are small compared with the wavelength, the formulation reduces to the simple expression obtained earlier by Lord Rayleigh; and the phenomenon is called Rayleigh scattering, in which the scattering efficiency is inversely proportional to the wavelength to the fourth power. The wavelength-dependent characteristic of light scattering by small particles helps explain why the sky is blue and why the sun appears red at sunset. For spheres whose diameters are much greater than the wavelength, geometric optics can be applied by treating the surface as specular or diffuse.
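The inverse fourth-power dependence can be made concrete with a short calculation. In the sketch below, the wavelengths of 450 nm (blue) and 650 nm (red) are assumed representative values, and the function name is hypothetical.

```python
def rayleigh_scattering_ratio(wavelength_a, wavelength_b):
    """Ratio of Rayleigh scattering efficiency at wavelength_a to wavelength_b.

    Uses only the proportionality to 1/wavelength^4, so any common prefactor
    cancels. Both wavelengths must be given in the same unit.
    """
    return (wavelength_b / wavelength_a) ** 4


if __name__ == "__main__":
    # Assumed representative wavelengths for blue and red light, in meters.
    print(rayleigh_scattering_ratio(450e-9, 650e-9))
```

Under these assumptions blue light is scattered roughly four times more strongly than red light, which is the trend behind the blue sky and the red sunset mentioned above.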


The spectral intensity in a participating medium, $I_\lambda = I_\lambda(\xi, \Omega, t)$, depends on the location (the coordinate $\xi$), its direction (the solid angle $\Omega$), and time $t$. In a time interval $dt$, the beam travels from $\xi$ to $\xi + d\xi$ ($d\xi = c\,dt$), and the intensity is attenuated by absorption and out-scattering, but enhanced by emission and in-scattering. The macroscopic description of the radiation intensity is known as the equation of radiative transfer (ERT).17

$$\frac{1}{c}\frac{\partial I_\lambda}{\partial t} + \frac{\partial I_\lambda}{\partial \xi} = a_\lambda I_{b,\lambda}(T) - (a_\lambda + \sigma_\lambda) I_\lambda + \frac{\sigma_\lambda}{4\pi} \int_{4\pi} I_\lambda(\xi, \Omega', t)\, \Phi_\lambda(\Omega', \Omega)\, d\Omega' \tag{2.52}$$

where $a_\lambda$ and $\sigma_\lambda$ are the absorption and scattering coefficients, respectively, $\Omega$ is the solid angle and direction of $I_\lambda$, and $\Omega'$ is the in-scattering solid angle and direction of $I_\lambda(\xi, \Omega', t)$. Here, $\Phi_\lambda(\Omega', \Omega)$ is the scattering phase function ($\Phi_\lambda = 1$ for isotropic scattering), which satisfies the equation: $\int_{4\pi} \Phi_\lambda(\Omega', \Omega)\, d\Omega' \equiv 4\pi$. The right-hand side of Eq. (2.52) is composed of three terms: the first accounts for the contribution of emission (which depends on the local gas temperature $T$); the second is the attenuation by absorption and out-scattering; and the third is the contribution of in-scattering from all directions (solid angle $4\pi$) to the direction $\Omega$. Unless ultrafast laser pulses are involved, the transient term is negligible. The ERT for the steady state can be simplified as

$$\frac{\partial I_\lambda(\zeta_\lambda, \Omega)}{\partial \zeta_\lambda} + I_\lambda(\zeta_\lambda, \Omega) = (1 - \eta_\lambda) I_{b,\lambda} + \frac{\eta_\lambda}{4\pi} \int_{4\pi} I_\lambda(\zeta_\lambda, \Omega')\, \Phi_\lambda(\Omega', \Omega)\, d\Omega' \tag{2.53}$$

where $\zeta_\lambda = \int_0^{\xi} (a_\lambda + \sigma_\lambda)\, d\xi$ is the optical path length, and $\eta_\lambda = \sigma_\lambda/(a_\lambda + \sigma_\lambda)$ is called the scattering albedo. This is an integro-differential equation, and its right-hand side is called the source function. The integration of the spectral intensity over all wavelengths and all directions gives the radiative heat flux. Unless the temperature field is prescribed, Eq. (2.53) is coupled with the heat conduction equation in a macroscopically stationary medium and the energy conservation equation in a fluid with convection.

Analytical solutions of the ERT rarely exist for applications with multidimensional and nonhomogeneous media. Approximate models have been developed to deal with special types of problems, including Hottel's zonal method, the differential and moment methods (often using the spherical harmonics approximation), and the discrete ordinates method. The statistical model using the Monte Carlo method is often used for complicated geometries and radiative properties.17 Analytical solutions can be obtained only for limited simple cases.

Example 2-7. A gray, isothermal gas at a temperature $T_g = 3000$ K occupies the space between two large, parallel blackbody surfaces. Surface 1 is heated to a temperature $T_1 = 1000$ K, while surface 2 is maintained at a relatively low temperature by water cooling. It is desired to know the amount of heat that must be removed from surface 2. If the scattering is negligible, calculate the heat flux at surface 2 for $a_\lambda L = 0.01$, 0.1, 1, and 10, where $L$ is the distance between the two surfaces.

Solution. For a gray medium without scattering, Eq. (2.53) becomes $(1/a_\lambda)\, dI/d\xi + I(\xi, \Omega) = I_b(T_g)$, where $\theta$ is the angle between $\xi$ and $x$. With $I_b(T_g) = \sigma_{SB} T_g^4/\pi$ and $I(0) = I_b(T_1) = \sigma_{SB} T_1^4/\pi$, the ERT can be integrated from $x = 0$ to $x = L$. The result is

$$I(L, \theta) = \frac{\sigma_{SB}}{\pi} T_1^4 \exp\!\left(-\frac{a_\lambda L}{\cos\theta}\right) + \frac{\sigma_{SB}}{\pi} T_g^4 \left[1 - \exp\!\left(-\frac{a_\lambda L}{\cos\theta}\right)\right]$$

The radiative flux at $x = L$ can be obtained by integrating the intensity over the hemisphere, i.e.,

$$q''(a_\lambda L) = \int_0^{2\pi}\!\!\int_0^{\pi/2} \frac{\sigma_{SB}}{\pi} \left[T_g^4 - (T_g^4 - T_1^4)\, e^{-a_\lambda L/\cos\theta}\right] \cos\theta \sin\theta\, d\theta\, d\phi = \sigma_{SB} T_g^4 - 2\sigma_{SB}(T_g^4 - T_1^4)\, E_3(a_\lambda L)$$

where $E_3(\zeta) = \int_0^1 \mu\, e^{-\zeta/\mu}\, d\mu$ is called the third exponential integral and can be numerically evaluated. The final results are tabulated as follows:

$a_\lambda L$          0.01             0.1              1                10
$E_3(a_\lambda L)$     0.49             0.416            0.11             3.48 × 10^-6
$q''$ (W/m²)           1.474 × 10^5     8.187 × 10^5     3.595 × 10^6     4.593 × 10^6

Discussion. In the optically thick limit ($a_\lambda L \gg 1$), $q'' \approx \sigma_{SB} T_g^4$, and all radiation leaving surface 1 will be absorbed by the gas before reaching surface 2. On the other hand, the heat flux is much greater than $\sigma_{SB} T_1^4 = 56.7$ kW/m² at $a_\lambda L = 0.01$. The gas absorption can be neglected in the optically thin limit; however, its emission contributes significantly to the radiative flux at surface 2. This is because the gas temperature is much higher than that of surface 1 and $L/\cos\theta$ can be much longer than $L$ for large $\theta$ values.
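The numbers in Example 2-7 can be reproduced approximately with a few lines of code. The following is a minimal sketch, not the author's procedure: it evaluates $E_3(\zeta) = \int_0^1 \mu\, e^{-\zeta/\mu}\, d\mu$ with composite Simpson's rule (an assumed choice of quadrature) and then applies the flux expression of the example. The function names, the number of intervals, and the rounded Stefan-Boltzmann constant are assumptions.

```python
import math

SIGMA_SB = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4), rounded


def exp_integral_3(x, n=20000):
    """Third exponential integral E3(x), integral of mu*exp(-x/mu) over [0, 1].

    Evaluated by composite Simpson's rule with n (even) intervals. The
    integrand tends to zero as mu -> 0, so it is set to zero at mu = 0.
    """
    def integrand(mu):
        return mu * math.exp(-x / mu) if mu > 0.0 else 0.0

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def flux_at_surface_2(aL, T_gas=3000.0, T_1=1000.0):
    """Radiative flux (W/m^2) arriving at surface 2 for optical thickness aL."""
    return (SIGMA_SB * T_gas**4
            - 2.0 * SIGMA_SB * (T_gas**4 - T_1**4) * exp_integral_3(aL))


if __name__ == "__main__":
    for aL in (0.01, 0.1, 1.0, 10.0):
        print(aL, exp_integral_3(aL), flux_at_surface_2(aL))
```

Small differences from the tabulated fluxes are to be expected, since the table is built from rounded values of $E_3$ and the result is sensitive to that rounding when the gas is optically thin.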

2.5 SUMMARY

This chapter provided an overview of classical or equilibrium thermodynamics, derived following logical steps and on a general basis, as well as the functional relations and thermodynamic properties of simple systems and ideal pure substances. Built upon the foundations of thermodynamics, the basic heat transfer modes were elaborated in a coherent way. Entropy generation is inevitably associated with any heat transfer process. The connection between heat transfer and entropy generation, which has been omitted by most heat transfer textbooks, was also discussed. The introduction of thermal radiation not only covered most of the undergraduate-level materials but also presented some basic graduate-level materials. This chapter should serve as a bridge or a reference to the rest of the book, dealing with energy transfer processes in micro/nanosystems and/or from a microscopic viewpoint of macroscopic phenomena.

REFERENCES

1. H. B. Callen, Thermodynamics and an Introduction to Thermostatistics, 2nd ed., Wiley, New York, 1985.
2. G. N. Hatsopoulos and J. H. Keenan, Principles of General Thermodynamics, Wiley, New York, 1965; J. H. Keenan, Thermodynamics, Wiley, New York, 1941.
3. E. P. Gyftopoulos and G. P. Beretta, Thermodynamics: Foundations and Applications, Macmillan, New York, 1991; Also see the augmented edition, Dover Publications, New York, 2005.
4. R. E. Sonntag, C. Borgnakke, and G. J. van Wylen, Fundamentals of Thermodynamics, 5th ed., Wiley, New York, 1998.
5. M. J. Moran and H. N. Shapiro, Fundamentals of Engineering Thermodynamics, 4th ed., Wiley, New York, 2000.
6. A. Bejan, Advanced Engineering Thermodynamics, 2nd ed., Wiley, New York, 1997.
7. J. Kestin (ed.), The Second Law of Thermodynamics, Dowden, Hutchinson & Ross, Inc., Stroudsburg, PA, 1976.
8. H. Preston-Thomas, “The International Temperature Scale of 1990 (ITS-90),” Metrologia, 27, 3–10, 1990.
9. Z. M. Zhang, “Surface temperature measurement using optical techniques,” Annu. Rev. Heat Transfer, 11, 351–411, 2000.
10. M. Kaviany, Principles of Heat Transfer, Wiley, New York, 2002.
11. F. P. Incropera and D. P. DeWitt, Fundamentals of Heat and Mass Transfer, 5th ed., Wiley, New York, 2002.
12. M. N. Özışık, Heat Conduction, 2nd ed., Wiley, New York, 1993.
13. Y. S. Touloukian and C. Y. Ho (eds.), Thermophysical Properties of Matter—The TPRC Data Series (13 volumes compilation of data on thermal conductivity, specific heat, linear expansion coefficient, thermal diffusivity, and radiative properties), Plenum Press, New York, 1970–1977.
14. A. Bejan, Entropy Generation Minimization, CRC Press, Boca Raton, FL, 1996.
15. R. F. Barron, Cryogenic Heat Transfer, Taylor & Francis, Philadelphia, PA, 1999.
16. M. C. Potter and D. C. Wiggert, Mechanics of Fluids, Prentice Hall, New Jersey, 1991.
17. R. Siegel and J. R. Howell, Thermal Radiation Heat Transfer, 4th ed., Taylor & Francis, New York, 2002.

PROBLEMS

2.1. Give examples of steady state. Give examples of thermodynamic equilibrium state. Give an example of spontaneous process. Is the growth of a plant a spontaneous process? Give an example of adiabatic process.

2.2. What is work? Describe an experiment that can measure the amount of work. What is heat? Describe an apparatus that can be used to measure heat. Are work and heat properties of a system?

2.3. Expand Eq. (2.1) and Eq. (2.2) in terms of the rate of energy and entropy change of an open system, which is subjected to work output, heat interactions, and multiple inlets and outlets of steady flow.

2.4. Discuss the remarks of Rudolf Clausius in 1867: (a) The energy of the universe is constant. (b) The entropy of the universe strives to attain a maximum value.

2.5. For a cyclic device experiencing heat interactions with reservoirs at $T_1, T_2, \ldots$, the Clausius inequality can be expressed as $\sum_i \delta Q_i/T_i \le 0$ or $\oint \delta Q/T \le 0$, regardless of whether the device produces or consumes work. Note that $\delta Q$ is positive when heat is received by the device. Prove the Clausius inequality by applying the second law to a closed system.

2.6. In the stable-equilibrium states, the energy and the entropy of a solid are related by $E = 3 \times 10^5 \exp[(S - S_0)/1000]$, where $E$ is in J, $S$ is in J/K, and $S_0$ is the entropy of the solid at a reference temperature of 300 K. Plot this relation in an E-S graph. Find expressions for $E$ and $S$ in terms of its temperature $T$ and $S_0$.

2.7. For an isolated system, give the mathematical expressions of the first and second laws of thermodynamics. Give graphic illustrations using E-S graph.

2.8. Place two identical metal blocks A and B, initially at different temperatures, in contact with each other but without interactions with any other systems. A thermal equilibrium is reached quickly. System C represents the combined system of both A and B. (a) Is the process reversible or not? Which system has experienced a spontaneous change of state? Which systems have experienced an induced change of state? (b) Assume that the specific heat of the metal is independent of temperature, $c_p = 240$ J/(kg·K), the initial temperatures are $T_{A1} = 800$ K and $T_{B1} = 200$ K, and the mass of each block is 5 kg. What is the final temperature? What is the total entropy generation in this process? (c) Show the initial and final states of systems A, B, and C in a u-s diagram, and indicate which state is not an equilibrium state. Determine the adiabatic availability of system C in the initial state.

2.9. Two blocks made of the same material with the same mass are allowed to interact with each other but isolated from the surroundings. Initially, block A is at 800 K and block B at 200 K. Assuming that the specific heat is independent of temperature, show that the final equilibrium temperature is 500 K. Determine the maximum and minimum entropies that may be transferred from block A to block B.

2.10. A cyclic machine receives 325 kJ heat from a 1000 K reservoir and rejects 125 kJ heat to a 400 K reservoir in a cycle that produces 200 kJ work. Is this cycle reversible, irreversible, or impossible?

2.11. If $z = z(x, y)$, then $dz = f\,dx + g\,dy$, where $f(x, y) = \partial z/\partial x$, $g(x, y) = \partial z/\partial y$. Therefore,

$$\frac{\partial f}{\partial y} = \frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial g}{\partial x}$$

The second-order derivatives of the fundamental equation and each of the characteristic functions yield a Maxwell relation. Maxwell's relations are very useful for evaluating the properties of a system in the stable-equilibrium states. For a closed system without chemical reactions, we have $dN_i \equiv 0$. Show that

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial P}{\partial S}\right)_V, \quad \left(\frac{\partial T}{\partial P}\right)_S = \left(\frac{\partial V}{\partial S}\right)_P, \quad \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V, \quad \text{and} \quad \left(\frac{\partial S}{\partial P}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_P$$

2.12. The isobaric volume expansion coefficient is defined as $\beta_P = (1/v)(\partial v/\partial T)_P$, the isothermal compressibility is $\kappa_T = -(1/v)(\partial v/\partial P)_T$, and the speed of sound is $v_a = \sqrt{(\partial P/\partial \rho)_s}$. For an ideal gas, show that $\beta_P = 1/T$, $\kappa_T = 1/P$, and $v_a = \sqrt{\gamma R T}$.

2.13. For a system with single type of constituents, the fundamental relation obtained by experiments gives $S = \alpha (NVU)^{1/3}$, where $\alpha$ is a positive constant, and $N$, $V$, $S$, and $U$ are the number of molecules, the volume, the entropy, and the internal energy of the system, respectively. Obtain expressions of the temperature and the pressure in terms of $N$, $V$, $U$, and $\alpha$. Show that $S = 0$ at zero temperature for constant $N$ and $V$.

2.14. For blackbody radiation in an evacuated enclosure of uniform wall temperature $T$, the energy density can be expressed as $u_v = U/V = (4/c)\,\sigma_{SB} T^4$, where $U$ is the internal energy, $V$ the volume, $c$ the speed of light, and $\sigma_{SB}$ the Stefan-Boltzmann constant. Determine the entropy $S(T, V)$ and the pressure $P(T, V)$, which is called the radiation pressure. Show that the radiation pressure is a function of temperature only and negligibly small at moderate temperatures. Hint:

$$S = \int_0^T \frac{1}{T} \left(\frac{\partial U}{\partial T}\right)_V dT \quad \text{and} \quad P = T \left(\frac{\partial S}{\partial V}\right)_T - \left(\frac{\partial U}{\partial V}\right)_T$$

2.15. A cyclic machine can only interact with two reservoirs at temperatures $T_A = 298$ K and $T_B = 77.3$ K, respectively. (a) If heat is extracted from reservoir A at a rate of $\dot{Q} = 1000$ W, what is the maximum rate of work that can be generated ($\dot{W}_{\max}$)? (b) If no work is produced, what is the rate of entropy generation ($\dot{S}_{\mathrm{gen}}$) of the cyclic machine? (c) Plot $\dot{S}_{\mathrm{gen}}$ versus $\dot{W}$ (the power produced).

2.16. An engineer claimed that it requires much more work to remove 0.1 J of heat from a cryogenic chamber at an absolute temperature of 0.1 K than to remove 270 J of heat from a refrigerator at 270 K. Assuming that the environment is at 300 K, justify this claim by calculating the minimum work required for each refrigeration task.

2.17. A solid block [$m = 10$ kg and $c_p = 0.5$ kJ/(kg·K)], initially at room temperature ($T_{A,1} = 300$ K), is cooled with a large tank of liquid-gas mixture of nitrogen at $T_B = 77.3$ K and atmospheric pressure. (a) After the block reaches the liquid-nitrogen temperature, what is the total entropy generation ($S_{\mathrm{gen}}$)? (b) Given the specific enthalpy of evaporation of nitrogen, $h_{fg} = 198.8$ kJ/kg, what must be its specific entropy of evaporation $s_{fg}$ in kJ/(kg·K), in order for the nitrogen tank to be modeled as a reservoir? Does $h_{fg} = T_{\mathrm{sat}}\, s_{fg}$ always hold?

2.18. Two same-size solid blocks of the same material are isolated from other systems [specific heat $c_p = 2$ kJ/(kg·K); mass $m = 5$ kg]. Initially block A is at a temperature $T_{A1} = 300$ K and block B at $T_{B1} = 1000$ K. (a) If the two blocks are put together, what will be the equilibrium temperature ($T_2$) and how much entropy will be generated ($S_{\mathrm{gen}}$)? (b) If the two blocks are connected with a cyclic machine, what is the maximum work that can be obtained ($W_{\max}$)? What would be the final temperature of the blocks ($T_3$) if the maximum work were obtained?

2.19. A rock [density $\rho = 2800$ kg/m³ and specific heat $c_p = 900$ J/(kg·K)] of 0.8 m³ is heated to 500 K using solar energy. A heat engine (cyclic machine) receives heat from the rock and rejects heat to the ambient at 290 K. The rock therefore cools down. (a) Find the maximum energy (heat) that the rock can give out. (b) Find the maximum work that can be done by the heat engine, $W_{\max}$. (c) In an actual process, the final temperature of the rock is 330 K and the work output from the engine is only half of $W_{\max}$. Determine the entropy generation of the actual process.


2.20. Consider three identical solid blocks with a mass of 5 kg each, initially at 300, 600, and 900 K, respectively. The specific heat of the material is $c_p = 2000$ J/(kg·K). A cyclic machine is available that can interact only with the three blocks. (a) What is the maximum work that can be produced? What are the final temperatures of each block? Is the final state in equilibrium? (b) If no work is produced, i.e., simply putting the three blocks together, what will be the maximum entropy generation? What will be the final temperature? (c) If the three blocks are allowed to interact via cyclic machine but not with any other systems in the environment, what is the highest temperature that can be reached by one of the blocks? (d) If the three blocks are allowed to interact via cyclic machine but not with any other systems in the environment, what is the lowest temperature that can be reached by one of the blocks?

2.21. Electrical power is used to raise the temperature of a 500 kg rock from 25 to 500°C. The specific heat of the rock material is $c_p = 0.85$ kJ/(kg·K). (a) If the rock is heated directly through resistive (Joule) heating, how much electrical energy is needed? Is this process reversible? If not, how much entropy is generated in this process? (b) By using cyclic devices that can interact with both the rock and the environment at 25°C, what is the minimum electrical energy required?

2.22. An insulated cylinder of 2 m³ is divided into two parts of equal volume by an initially locked piston. Side A contains air at 300 K and 200 kPa; side B contains air at 1500 K and 1 MPa. The piston is now unlocked so that it is free to move and it conducts heat. An equilibrium state is reached between the two sides after a while. (a) Find the masses in both A and B. (b) Find the final temperatures, pressures, and volumes for both A and B. (c) Find the entropy generation in this process.

2.23. A piston-cylinder contains 0.56 kg of N2 gas, initially at 600 K. A cyclic machine receives heat from the cylinder and releases heat to the environment at 300 K. Assume that the specific heat of N2 is $c_p = 1.06$ kJ/(kg·K) and the pressure inside the cylinder is maintained at 100 kPa by the environment. What is the maximum work that can be produced by the machine? What is the thermal efficiency (defined as the ratio of the work output to the heat received)? The thermodynamic efficiency can be defined as the ratio of the actual work produced to the maximum work. Plot the thermodynamic efficiency as a function of the entropy generation. What is the maximum entropy generation?

2.24. An air stream [$c_p = 1$ kJ/(kg·K) and $M = 29.1$ kg/kmol] flows through a power plant. The stream enters a turbine at $T_1 = 750$ K and $P_1 = 6$ MPa, and exits at $P_2 = 1.2$ MPa into a recovery unit, which can exchange heat with the environment at 25°C and 100 kPa. The stream then exits the recovery unit to the environment. The turbine is thermally insulated and has an efficiency $\eta_t = 0.85$. (a) Find the power per unit mass flow rate produced by the turbine. (b) Calculate the entropy generation rate in the turbine. (c) Determine the largest power that can be produced by the recovery unit.

2.25. Water flows in a perfectly insulated, steady state, horizontal duct of variable cross-sectional area. Measurements were taken at two ports and the data were recorded in a notebook as follows. For port 1, the speed is 3 m/s, pressure $P_1 = 50$ kPa, and temperature $T_1 = 40$°C; for port 2, the speed is 5 m/s and $P_2 = 45$ kPa. Some information was accidentally left out by the student taking the notes. Can you determine $T_2$ and the direction of the flow based on the available information? Hint: Model the water as an ideal incompressible liquid with $c_p = 4.2$ kJ/(kg·K) and specific volume $v = 10^{-3}$ m³/kg.

2.26. An insulated rigid vessel contains 0.4 kmol of oxygen at 200 kPa separated by a membrane from 0.6 kmol of carbon dioxide at 400 kPa; both sides are initially at 300 K. The membrane is suddenly broken and, after a while, the mixture comes to a uniform state (equilibrium). (a) Find the final temperature and pressure of the mixture. (b) Determine the entropy generation due to irreversibility.

2.27. Pure N2 and air (21% O2 and 79% N2 by volume), both at 298 K and 120 kPa, enter a chamber at a flow rate of 0.1 and 0.3 kmol/s, respectively. The new mixture leaves the chamber at the same temperature and pressure as the incoming streams. (a) What are the mole fractions and the mass fractions of N2 and O2 at the exit? (b) Find the enthalpy change in the mixing process. Find the entropy generation rate of the mixing process. (c) Consider a process in which the flow directions are reversed. The chamber now contains necessary devices for the separation, and it may transfer heat to the environment at 298 K. What is the minimum amount of work per unit time needed to operate the separation devices?

2.28. A Carnot engine receives energy from a reservoir at $T_H$ and rejects heat to the environment at $T_0$ via a heat exchanger. The engine works reversibly between $T_H$ and $T_L$, where $T_L$ is the temperature of the higher-temperature side of the heat exchanger. The product of the area and the heat transfer coefficient of the heat exchanger is $\alpha$. Therefore, the heat that must be rejected to the environment through the heat exchanger is $\dot{Q}_L = \alpha(T_L - T_0)$. Given $T_H = 800$ K, $T_0 = 300$ K, and $\alpha = 2300$ W/K, determine the value of $T_L$ so that the heat engine will produce maximum work, and calculate the power production and the entropy generation in such a case.

2.29. To measure the thermal conductivity, a thin-film electric heater is sandwiched between two plates whose sides are well insulated. Each plate has an area of 0.1 m² and a thickness of 0.05 m. The outer surfaces of the plates are exposed to air at $T_\infty = 25$°C with a convection coefficient of $h = 40$ W/(m²·K). The electric power of the heater is 400 W and a thermocouple inserted between the two plates measures a temperature of $T_1 = 175$°C at steady state. Determine the thermal conductivity of the plate material. Find the total entropy generation rate. Comment on the fraction of entropy generation due to conduction and convection.

2.30. An electric current, $I = 2$ A, passes through a resistive wire of diameter $D = 3$ mm with a resistivity $r_e = 1.5 \times 10^{-4}$ Ω·m. The cable is placed in ambient air at 27°C with a convection coefficient $h = 20$ W/(m²·K). Assume a steady state has been reached and neglect radiation. Determine the radial temperature distribution inside the wire. Determine the volumetric entropy generation rate $\dot{s}_{gen}$ as a function of radius. Determine the total entropy generation rate per unit length of the cable. Hint: For steady-state conduction, $\dot{s}_{gen} = \mathbf{q}'' \cdot \nabla(1/T) = -(1/T^2)(\mathbf{q}'' \cdot \nabla T)$.

2.31. Find the thermal conductivity of intrinsic (undoped) silicon, heavily doped silicon, quartz, glass, diamond, graphite, and carbon from 100 to 1000 K from Touloukian and Ho.13 Discuss the variations between different materials, crystalline structures, and doping concentrations.

2.32. Find the thermal conductivity of copper from 1 to 1000 K from Touloukian and Ho.13 Discuss the general trend in terms of temperature dependence, and comment on the effect of impurities.

2.33. For laminar flow over a flat plate, the velocity and thermal boundary layer thicknesses can be calculated by $\delta(x) = 5x\,Re_x^{-1/2}$ and $\delta_t(x) = 5x\,Re_x^{-1/2}Pr^{-1/3}$, respectively. Use room temperature data to calculate and plot the boundary layer thicknesses for air, water, engine oil, and mercury for different values of $U_\infty$. Discuss the main features. Hint: Property data can be found from Incropera and DeWitt.11

2.34. Air at 14°C and atmospheric pressure is in parallel flow over a flat plate of 2 × 2 m². The air velocity is 3 m/s and the surface is maintained at 140°C. Determine the average convection coefficient and the rate of heat transfer from the plate to air. (For air at 350 K, which is the average temperature between the surface and fluid, $k = 0.03$ W/(m·K), $\nu = 20.9 \times 10^{-6}$ m²/s, and $Pr = 0.7$.)

2.35. Plot the blackbody intensity (Planck’s law) as a function of wavelength for several temperatures. Discuss the main features of this function. Show that in the long-wavelength limit, the blackbody function can be approximated by $e_{b,\lambda}(\lambda,T) \approx 2\pi c k_B T/\lambda^4$, which is the Rayleigh-Jeans formula.

2.36. Calculate the net radiative heat flux from the human body at a surface temperature of $T_s = 308$ K, with an emissivity $\varepsilon = 0.9$, to the room walls at 298 K. Assume air at 298 K has a natural convection coefficient of 5 W/(m²·K). Neglecting evaporation, calculate the natural convection heat flux from the person to air. Comment on the significance of thermal radiation.

2.37. Combustion takes place in a spherical enclosure of diameter $D = 50$ cm with a constant wall temperature of 600 K. The temperature of the combustion gas may be approximated as uniform at 2300 K. The absorption coefficient of the gas is $a_\lambda = 0.01$ cm⁻¹, which is independent of wavelength. Assuming that the wall is black and neglecting the scattering effect, determine the net heat transfer rate between the gas and the inner wall of the sphere.

CHAPTER 3

ELEMENTS OF STATISTICAL THERMODYNAMICS AND QUANTUM THEORY

Classical statistical mechanics is based on the assumption that all matter is composed of a myriad of small discrete particles, such as molecules and atoms, in any given macroscopic volume.1–5 There are about $N = 2.5 \times 10^{16}$ molecules per cubic millimeter of air at standard conditions (25°C and 1 atm). These particles are in continuous random motion, which generally obeys the laws of classical mechanics. A complete microscopic description of a system requires the identification of the position $\mathbf{r}_i(t)$ and velocity $\mathbf{v}_i(t)$ of each particle (here, subscript $i$ indicates the $i$th particle) at any time. For a simple system of $N$ molecules in a box of volume $V$, one can write Newton’s law of motion for each molecule as

$$\sum_{j} \mathbf{F}_{ij}(\mathbf{r}_i, \mathbf{r}_j, t) = m_i \frac{d\mathbf{v}_i}{dt}, \quad i = 1, 2, \ldots, N \tag{3.1}$$

where $\mathbf{F}_{ij}$ is the intermolecular force that the $j$th molecule exerts on the $i$th molecule, and $m_i$ is the mass of the $i$th molecule. The initial position and velocity, as well as the nature of collisions among particles and that between particles and the walls of the box, must be specified in order to solve the $N$ equations. Although this approach is straightforward, there are two major barriers. First, the intermolecular forces or potentials are often complicated and difficult to determine. Second, the solution of Eq. (3.1) requires significant computer resources even for rather simple problems. Statistical methods are often used instead to obtain microscopic descriptions that are related to macroscopic behaviors.

Statistical mechanics aims at finding the equilibrium distribution of certain types of particles in the velocity space. It provides a linkage between macroscopic thermodynamic properties and the microscopic behavior and a means to evaluate some thermodynamic properties. Kinetic theory, on the other hand, deals with nonequilibrium processes. It gives a microscopic description of transport phenomena and helps predict some important transport properties, as will be seen in Chap. 4.

Along with the rapid development in computing speed and memory, molecular dynamics (MD) simulation has become a powerful tool for the investigation of phenomena occurring in nanostructures and/or at very short time scales. In the MD method, the location and the velocity of every particle are calculated at each time step by applying Eq. (3.1) with a suitable potential function.6,7 Thermodynamic properties are then evaluated using statistical mechanics formulation. Further discussion about the application of MD simulation to predict the thermal properties of nanostructures will be given in Chap. 7.

This chapter starts with a statistical model of independent particles and a brief introduction to the basic principles of quantum mechanics. The necessary mathematical background is summarized in Appendix B. It is highly recommended that one review the materials covered in the appendix before studying this chapter. The three important distributions are derived based on the statistics for different types of particles. The microscopic descriptions and results are then linked to macroscopic quantities and the laws of thermodynamics. The application to ideal gases is presented at the end of this chapter, while discussions of blackbody radiation, lattice vibrations, and free electron gas will be deferred to later chapters.
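To make the MD idea concrete, the following is a minimal sketch of how Eq. (3.1) might be stepped forward in time. It is not taken from the text: it assumes a Lennard-Jones pair potential in reduced units, two particles on a line, and the velocity-Verlet integrator, all of which are illustrative choices, and the function names are hypothetical.

```python
def lj_force(r, eps=1.0, sigma=1.0):
    """Lennard-Jones force along the separation r (positive means repulsive)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r


def lj_potential(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair potential energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def simulate_pair(x1=0.0, x2=1.5, v1=0.0, v2=0.0, m=1.0, dt=1e-3, steps=5000):
    """Velocity-Verlet integration of m dv_i/dt = F_ij for two particles (x2 > x1)."""

    def forces(xa, xb):
        f = lj_force(xb - xa)   # force on particle 2 due to particle 1
        return -f, f            # Newton's third law gives the force on particle 1

    f1, f2 = forces(x1, x2)
    energies = []
    for _ in range(steps):
        v1 += 0.5 * dt * f1 / m     # half-step velocity update
        v2 += 0.5 * dt * f2 / m
        x1 += dt * v1               # full-step position update
        x2 += dt * v2
        f1, f2 = forces(x1, x2)     # forces at the new positions
        v1 += 0.5 * dt * f1 / m     # second half-step velocity update
        v2 += 0.5 * dt * f2 / m
        kinetic = 0.5 * m * (v1 * v1 + v2 * v2)
        energies.append(kinetic + lj_potential(x2 - x1))
    return x1, x2, v1, v2, energies


x1, x2, v1, v2, energies = simulate_pair()
print("final separation:", x2 - x1)
print("energy drift:", max(energies) - min(energies))
```

With only two particles the total energy and momentum should stay nearly constant, which is a common sanity check before scaling such a loop to many particles and three dimensions.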

3.1 STATISTICAL MECHANICS OF INDEPENDENT PARTICLES

We say particles are independent when their energies are independent of each other and the total energy is the sum of the energies of individual particles. Consider a system that has $N$ independent particles of the same type confined in a volume $V$. The total internal energy of the system is $U$, which is the sum of the energies of all particles. Particles may have different energies and can be grouped according to their energies. It is of interest to know how many particles there are within certain energy intervals. We can subdivide energy into a large number of discretized energy levels. As illustrated in Fig. 3.1, there are $N_i$ particles on the $i$th energy level, each with energy exactly equal to $\varepsilon_i$.

FIGURE 3.1 Illustration of (a) a simple system of independent particles and (b) energy levels.

From the classical mechanics point of view, it appears that the increment between adjacent energy levels can be indefinitely small. The particles are distinguishable, and there is no limit on the number of particles on each energy level. Quantum mechanics predicts that the energy levels are indeed discretized with finite increments between adjacent energy levels, and the particles are unidentifiable (indistinguishable). An introduction to the basic principles of quantum mechanics is given in Sec. 3.1.3 and a more detailed introduction of the quantum theory is given near the end of this chapter. The conservation equations for the system shown in Fig. 3.1 are

$$\sum_{i=0}^{\infty} N_i = N \tag{3.2}$$

and

$$\sum_{i=0}^{\infty} \varepsilon_i N_i = U \tag{3.3}$$

3.1.1 Macrostates versus Microstates

The thermodynamic state may be viewed in terms of the gross behavior that ignores any differences at the molecular or atomic level, or in terms of the individual particles. A macrostate is determined by the values of $N_0, N_1, N_2, \ldots$ for a given volume (which somehow confines the quantized energy levels), though two different macrostates can have the same energy. Each macrostate may be made up of a number of microscopic arrangements; each microscopic arrangement is called a microstate. In statistical mechanics, all microstates are assumed equally probable. There may be a large number of microstates that correspond to the same macrostate. The number of microstates for each macrostate is termed the thermodynamic probability $\Omega$ of that macrostate. Unlike the stochastic probability that lies between 0 and 1, the thermodynamic probability is usually a very large number. One of the principles underlying statistical mechanics is that the stable-equilibrium state corresponds to the most probable macrostate. Therefore, for given values of $U$, $N$, and $V$, the thermodynamic probability is the largest in the stable-equilibrium state. We will use the following example to illustrate the concepts of microstate and macrostate.

Example 3-1. There are four distinguishable particles in a confined space, and there are two energy levels. How many macrostates are there? How many microstates are there for the macrostate with two particles on each energy level?

Solution. There are five macrostates in total with $(N_1, N_2) = (0, 4)$, $(1, 3)$, $(2, 2)$, $(3, 1)$, and $(4, 0)$, respectively. Because the particles are distinguishable, the microstates will be different only if the particles from different energy levels are interchanged. Using the combination theory, we can figure out that $\Omega(N_1, N_2) = N!/(N_1!\,N_2!) = 4!/(2!\,2!) = 6$, i.e., there are six microstates for the macrostate with two particles on each energy level. It can be shown that this is also the most probable macrostate.
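The count in Example 3-1 can be reproduced by brute force. The short sketch below (an illustration, not part of the original example; the function name is hypothetical) lists every assignment of four labeled particles to two levels and groups the assignments by macrostate, then compares each count with the binomial coefficient.

```python
from itertools import product
from math import comb


def count_microstates(n_particles=4):
    """Map each macrostate (N1, N2) to its number of microstates."""
    counts = {}
    # Each tuple assigns level 1 or level 2 to every labeled particle.
    for assignment in product((1, 2), repeat=n_particles):
        n1 = assignment.count(1)
        macrostate = (n1, n_particles - n1)
        counts[macrostate] = counts.get(macrostate, 0) + 1
    return counts


counts = count_microstates(4)
for (n1, n2), omega in sorted(counts.items()):
    # The enumerated count should match N!/(N1! N2!).
    print(f"(N1, N2) = ({n1}, {n2}): {omega} microstates, formula gives {comb(4, n1)}")
```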

3.1.2 Phase Space

The phase space is a six-dimensional space formed by three coordinates for the position $\mathbf{r}$ and three coordinates for the momentum $\mathbf{p} = m\mathbf{v}$ or velocity $\mathbf{v}$. Each point in the phase space defines the exact location and momentum of an individual particle. If both the space and the momentum are described with the Cartesian system, then a volume element in the phase space is $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$. Figure 3.2 shows a phase space projected to the $x$-$p_x$ plane.

FIGURE 3.2 Phase space projected to the $x$-$p_x$ plane, where $\Delta x\,\Delta p_x$ is an area element.

The three coordinates $(p_x, p_y, p_z)$ form a momentum space. One may choose to use $(v_x, v_y, v_z)$ to form a velocity space. If the momentum space is described in spherical coordinates, the volume element is $dp_x\,dp_y\,dp_z = p^2 \sin\theta\,dp\,d\theta\,d\phi$. The volume contained in a spherical shell from $p$ to $p + dp$ is $4\pi p^2\,dp$. Figure 3.3 illustrates the momentum space projected to the $p_x$-$p_y$ plane, with a spherical shell.

FIGURE 3.3 The $p_x$-$p_y$ plane of the momentum space, showing a spherical shell.

3.1.3 Quantum Mechanics Considerations

The origin of quantum theory can be traced back to about 100 years ago when Planck first used a discrete set of energies to describe the electromagnetic radiation, and thus obtained Planck’s distribution (details to be presented in Sec. 8.1). For any given frequency of radiation $\nu$, the smallest energy increment is given by $h\nu$, where $h = 6.626 \times 10^{-34}$ J·s is called Planck’s constant. Radiation can be alternatively viewed as electromagnetic waves or traveling energy quanta. The corpuscular theory treats radiation as a collection of energy quanta, called photons. The energy of a photon is given by

$$\varepsilon = h\nu \tag{3.4}$$

From the wave theory, the speed of light $c$ is related to the wavelength $\lambda$ and the frequency by

$$c = \lambda\nu \tag{3.5}$$

In a medium with a refractive index of $n$, $c = c_0/n$ and $\lambda = \lambda_0/n$, where subscript 0 is used to indicate quantities in vacuum with $n = 1$. The speed of light in vacuum is $c_0 = 299{,}792{,}458$ m/s, which is a defined quantity as given in Appendix A. Note that frequency does not change from one medium to another. Based on the relativistic theory, the rest energy $E_0$ of a particle with mass $m$ is

$$E_0 = mc^2 \tag{3.6}$$

The momentum of the particle traveling with speed $v$ is $p = mv$. Since the energy of a photon is $h\nu$ and its speed is $c$, the momentum of a (massless) photon is (see Sec. 3.7)

$$p = \frac{h\nu}{c} = \frac{h}{\lambda} \tag{3.7}$$

Another hypothesis of quantum theory is that the motion of matter may be wavelike, with characteristic wavelength and frequency. Therefore, for a particle moving with velocity $v \ll c$,

$$\lambda_{DB} = \frac{h}{p} = \frac{h}{mv} \quad \text{and} \quad \nu_{DB} = \frac{mc^2}{h} \tag{3.8}$$

which are called de Broglie wavelength and de Broglie frequency, respectively. In 1923, Louis de Broglie postulated that matter may also possess wave characteristics and thereafter resolved the controversy as per the nature of radiation. Note that the phase speed of the wave defined by Eq. (3.8) is $c^2/v$, which is greater than the speed of light. The discovery of electron diffraction confirmed de Broglie’s hypothesis. For this prediction, de Broglie received the Nobel Prize in physics in 1929. Eight years later, the 1937 Nobel Prize in physics was shared by Clinton J. Davisson and George P. Thomson for their independent experiments that demonstrated diffraction of electrons by crystals.

Example 3-2. Calculate the frequency in Hz and photon energy in eV of an ultraviolet (UV) laser beam at a wavelength of $\lambda = 248$ nm and a microwave at $\lambda = 10$ cm. Calculate the de Broglie wavelength of an He atom at 200°C, using the average speed of 1717 m/s, and an electron traveling with a speed of $10^6$ m/s.

Solution. The equations are $\nu = c/\lambda$ and $\varepsilon = h\nu$. Assume the refractive index is 1. For the UV beam at 248 nm, $\nu = 1.2 \times 10^{15}$ Hz and $\varepsilon = 8.01 \times 10^{-19}$ J = 5 eV. For $\lambda = 10$ cm, $\nu = 3 \times 10^{9}$ Hz = 3 GHz and $\varepsilon = 2 \times 10^{-24}$ J = $1.24 \times 10^{-5}$ eV = 12.4 μeV. The mass of an He atom is $m = M/N_A = 6.64 \times 10^{-27}$ kg. Hence, $\lambda_{DB} = h/mv = 5.8 \times 10^{-11}$ m = 58 pm. From Appendix A, $m_e = 9.11 \times 10^{-31}$ kg; therefore, $\lambda_{DB} = 7.3 \times 10^{-10}$ m = 0.73 nm, which is in the x-ray region.
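The numbers in Example 3-2 follow directly from Eqs. (3.4), (3.5), and (3.8), and can be checked with a few lines of code. This is only a sketch: the rounded constants and the helium molar mass of 4.0026 g/mol are assumptions supplied here, and the function names are hypothetical.

```python
H = 6.626e-34        # Planck's constant, J s
C0 = 299_792_458     # speed of light in vacuum, m/s
EV = 1.602e-19       # joules per electron volt (rounded)
N_A = 6.022e23       # Avogadro's number, 1/mol (rounded)
M_HE = 4.0026e-3     # molar mass of helium, kg/mol (assumed value)
M_E = 9.11e-31       # electron mass, kg


def photon(wavelength_m):
    """Return (frequency in Hz, photon energy in eV) for a vacuum wavelength."""
    nu = C0 / wavelength_m          # Eq. (3.5) with n = 1
    return nu, H * nu / EV          # Eq. (3.4), converted to eV


def de_broglie_wavelength(mass_kg, speed_m_s):
    """Nonrelativistic de Broglie wavelength, Eq. (3.8)."""
    return H / (mass_kg * speed_m_s)


for name, lam in (("UV laser, 248 nm", 248e-9), ("microwave, 10 cm", 0.10)):
    nu, energy_ev = photon(lam)
    print(f"{name}: nu = {nu:.3e} Hz, energy = {energy_ev:.3e} eV")

print(f"He atom:  {de_broglie_wavelength(M_HE / N_A, 1717.0):.2e} m")
print(f"electron: {de_broglie_wavelength(M_E, 1.0e6):.2e} m")
```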

The foundation of quantum mechanics is the Schrödinger equation, which is a partial differential equation for a complex wavefunction that depends on time and space and whose squared magnitude gives the probability density. More details can be found from Tien and Lienhard,1 Carey,5 and Griffiths.8 The solutions of the Schrödinger equation support the dual nature of wave and matter, and result in discrete quantized energy levels. Furthermore, there is usually more than one distinguishable quantum state at each energy level, i.e., the energy levels may be degenerate. The number of quantum states for a given energy level is called the degeneracy, denoted by $g_i$ for the $i$th energy level, as shown in Fig. 3.4.

FIGURE 3.4 The degeneracy of the ith energy level.

The uncertainty principle states that the position and momentum of a given particle cannot be measured simultaneously with arbitrary precision. The limit is given by

$$\Delta x\,\Delta p_x \ge \frac{h}{4\pi} \tag{3.9}$$

This result implies that we cannot locate the exact position of a particle in the phase space; all we can say is that the particle is somewhere in a domain whose volume is around $h^3$. The uncertainty principle is one of the cornerstones of quantum mechanics and was formulated in 1927 by Werner Heisenberg, a Nobel Laureate in Physics.

In quantum theory, independent particles of the same type are indistinguishable. For certain particles, such as electrons, each quantum state cannot be occupied by more than one particle. This is the Pauli exclusion principle, discovered by Nobel Laureate Wolfgang Pauli in 1925. The result, as we will see, is the Fermi-Dirac statistics that can be used to describe the behavior of free electrons. The collection of free electrons in metals is sometimes called the free electron gas, which exhibits very different characteristics from ideal molecular gases.

3.1.4 Equilibrium Distributions for Different Statistics

The characteristics of various types of particles can be described by different statistics. In this section, we will first introduce three statistics and then apply them to obtain the distribution functions, i.e., the number of particles on each energy level. The application of the distribution functions to the study of thermodynamic properties of ideal molecular gases will be discussed later in this chapter. The applications of statistical thermodynamics to blackbody radiation, lattice vibration, free electrons in metals, and electrons and holes in semiconductors will be discussed in subsequent chapters.

• The Maxwell-Boltzmann (MB) statistics: Particles are distinguishable and there is no limit for the number of particles on each energy level. From Eq. (B.22) in Appendix B, the thermodynamic probability for the distribution shown in Fig. 3.1b is

$$\Omega = \frac{N!}{N_0!\,N_1!\,N_2!\cdots} = \frac{N!}{\prod_{i=0}^{\infty} N_i!}$$

If degeneracy is included as shown in Fig. 3.4, then

$$\Omega_{MB} = N! \prod_{i=0}^{\infty} \frac{g_i^{N_i}}{N_i!} \tag{3.10}$$

• The Bose-Einstein (BE) statistics: Particles are indistinguishable and there is no limit for the number of particles in each quantum state; there are $g_i$ quantum states on the $i$th energy level. From Eq. (B.23), the number of ways of placing $N_i$ indistinguishable objects to $g_i$ distinguishable boxes is $(g_i + N_i - 1)!/[(g_i - 1)!\,N_i!]$. Therefore, the thermodynamic probability for BE statistics is

$$\Omega_{BE} = \prod_{i=0}^{\infty} \frac{(g_i + N_i - 1)!}{(g_i - 1)!\,N_i!} \tag{3.11}$$

• The Fermi-Dirac (FD) statistics: Particles are indistinguishable and the energy levels are degenerate. There are $g_i$ quantum states on the $i$th energy level, and each quantum state can be occupied by no more than one particle. Using Eq. (B.21), we obtain the thermodynamic probability for FD statistics as

$$\Omega_{FD} = \prod_{i=0}^{\infty} \frac{g_i!}{(g_i - N_i)!\,N_i!} \tag{3.12}$$

The three statistics are very important for understanding the molecular, electronic, crystalline, and radiative behaviors that are essential for energy transport processes in both small and large scales. MB statistics can be considered as the limiting case of BE or FD statistics. The thermodynamic relations and the velocity distribution of ideal molecular gases can be understood from MB statistics. BE statistics is important for the study of photons, phonons in solids, and atoms at low temperatures. It is the basis of Planck’s law of blackbody radiation, the Debye theory for the specific heat of solids, and the Bose-Einstein condensation, which is important for superconductivity, superfluidity, and laser cooling of atoms. FD statistics can be used to model the electron gas and the electron contribution to the specific heat of solids. It is important for understanding the electronic and thermal properties of metals and semiconductors.

Example 3-3. Four indistinguishable particles are to be placed in two energy levels, each with a degeneracy of 3. Evaluate the thermodynamic probability of all arrangements, considering BE and FD statistics separately. What are the most probable arrangements?

Solution. There are two energy levels, $g_0 = g_1 = 3$, and the total number of particles is $N = 4$. The thermodynamic probability is $\Omega = \Omega_0 \times \Omega_1$, which depends on $N_0$ and $N_1$ ($N_0 + N_1 = 4$). Figure 3.5 shows specific cases of the BE and FD distributions.

FIGURE 3.5 Illustration of the arrangement for four particles on two energy levels, each with a degeneracy of 3. (a) Bose-Einstein statistics. (b) Fermi-Dirac statistics.

For BE statistics, we have

$$\Omega_{BE} = \frac{(N_0 + g_0 - 1)!}{(g_0 - 1)!\,N_0!} \times \frac{(N_1 + g_1 - 1)!}{(g_1 - 1)!\,N_1!} = \frac{(N_0 + 2)(N_0 + 1)}{2} \times \frac{(6 - N_0)(5 - N_0)}{2}$$

For FD statistics, we must have $N_i \le g_i$; therefore, $1 \le N_0 \le 3$, and

$$\Omega_{FD} = \frac{g_0!}{(g_0 - N_0)!\,N_0!} \times \frac{g_1!}{(g_1 - N_1)!\,N_1!} = \frac{6}{(3 - N_0)!\,N_0!} \times \frac{6}{(N_0 - 1)!\,(4 - N_0)!}$$

The results are summarized in the following table. Clearly, the most probable arrangement for both statistics in this case is $N_0 = N_1 = 2$.

| $N_0$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $N_1$ | 4 | 3 | 2 | 1 | 0 |
| $\Omega_{BE}$ | 15 | 30 | 36 | 30 | 15 |
| $\Omega_{FD}$ | – | 3 | 9 | 3 | – |
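The table in Example 3-3 can be regenerated from Eqs. (3.11) and (3.12) with binomial coefficients. The sketch below is an illustration added here, with hypothetical function names; an FD count of zero corresponds to the arrangements marked "–" in the table, since they would need more particles than quantum states on one level.

```python
from math import comb


def omega_be(n, g):
    """Ways to place n indistinguishable particles in g states, no occupancy limit."""
    return comb(g + n - 1, n)


def omega_fd(n, g):
    """Ways to place n indistinguishable particles in g states, at most one per state."""
    return comb(g, n)   # math.comb returns 0 when n > g


def example_table(n_total=4, g0=3, g1=3):
    """Return rows (N0, N1, Omega_BE, Omega_FD) for every split of n_total particles."""
    rows = []
    for n0 in range(n_total + 1):
        n1 = n_total - n0
        rows.append((n0, n1,
                     omega_be(n0, g0) * omega_be(n1, g1),
                     omega_fd(n0, g0) * omega_fd(n1, g1)))
    return rows


for row in example_table():
    print(row)
```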

For a given simple thermodynamic system of volume $V$, internal energy $U$, and total number of particles $N$, we wish to find the state (identified by the distribution $N_0, N_1, N_2, \ldots$) that maximizes $\Omega$ or $\ln\Omega$, under the constraints given by Eq. (3.2) and Eq. (3.3), based on the method of Lagrange multipliers (Appendix B). For MB statistics with degeneracy, from Eq. (3.10),

$$\ln\Omega = \ln N! + \sum_{i=0}^{\infty} N_i \ln g_i - \sum_{i=0}^{\infty} \ln N_i!$$

For a large number of particles, the Stirling formula gives $\ln N! \approx N \ln N - N$ from Eq. (B.11). The above equation can be approximated as

$$\ln\Omega \approx N \ln N - N + \sum_{i=0}^{\infty} N_i \ln g_i - \sum_{i=0}^{\infty} (N_i \ln N_i - N_i) = N \ln N - N + \sum_{i=0}^{\infty} N_i \left( \ln\frac{g_i}{N_i} + 1 \right)$$

Notice that $N$ and the $g_i$’s are fixed and only the $N_i$’s are variables; therefore,

$$d(\ln\Omega) = \sum_{i=0}^{\infty} \frac{\partial(\ln\Omega)}{\partial N_i}\,dN_i \approx \sum_{i=0}^{\infty} \left( \ln\frac{g_i}{N_i} + 1 - N_i \frac{1}{N_i} \right) dN_i = \sum_{i=0}^{\infty} \ln\frac{g_i}{N_i}\,dN_i = 0 \tag{3.13}$$

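From the constraint equations, Eq. (3.2) and Eq. (3.3), we have

$$-\alpha \sum_{i=0}^{\infty} dN_i = 0 \tag{3.14a}$$

and

$$-\beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i = 0 \tag{3.14b}$$

where $\alpha$ and $\beta$ are Lagrangian multipliers and the $\varepsilon_i$’s are treated as constants. Conventionally, negative signs are chosen because $\alpha$ and $\beta$ are generally nonnegative for molecular gases. By adding Eq. (3.14a) and Eq. (3.14b) to Eq. (3.13), we obtain

$$\sum_{i=0}^{\infty} \left( \ln\frac{g_i}{N_i} - \alpha - \beta\varepsilon_i \right) dN_i = 0$$

Because $dN_i$ can be arbitrary, the above equation requires that $\ln(g_i/N_i) - \alpha - \beta\varepsilon_i = 0$. Hence,

$$N_i = \frac{g_i}{e^{\alpha} e^{\beta\varepsilon_i}} \tag{3.15a}$$

or

$$\frac{N_i}{N} = \frac{g_i e^{-\alpha} e^{-\beta\varepsilon_i}}{\sum_{i=0}^{\infty} g_i e^{-\alpha} e^{-\beta\varepsilon_i}} \tag{3.15b}$$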

This is the MB distribution. The physical meanings of $\alpha$ and $\beta$ will be discussed later. Using the same procedure described above, we can obtain the following for BE statistics,

$$N_i = \frac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} - 1} \tag{3.16}$$

which is the BE distribution. For FD statistics, we can obtain the FD distribution as follows:

$$N_i = \frac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} + 1} \tag{3.17}$$
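As a numerical sanity check on the Lagrange-multiplier result for MB statistics, one can search a small system exhaustively for the macrostate with the largest thermodynamic probability and compare it with the exponential form of Eq. (3.15). The sketch below is an illustration under assumed conditions that are not from the text: three nondegenerate levels with energies 0, 1, and 2 in arbitrary units, $N = 70$, and $U = 30$; the function names are hypothetical.

```python
from math import lgamma


def ln_omega(occupations):
    """ln of N!/(N0! N1! N2!) for nondegenerate levels (g_i = 1)."""
    n_total = sum(occupations)
    return lgamma(n_total + 1) - sum(lgamma(n + 1) for n in occupations)


def most_probable_macrostate(n_total, u_total):
    """Search all (N0, N1, N2) with N0 + N1 + N2 = N and N1 + 2*N2 = U."""
    best = None
    for n2 in range(u_total // 2 + 1):
        n1 = u_total - 2 * n2
        n0 = n_total - n1 - n2
        if n0 < 0:
            continue
        candidate = (ln_omega((n0, n1, n2)), (n0, n1, n2))
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best[1]


def mb_macrostate(n_total, u_total, tol=1e-12):
    """MB populations N_i proportional to x**i, where x = exp(-beta*eps) is found by bisection."""
    target = u_total / n_total
    lo, hi = 0.0, 1.0            # assumes U < N, i.e., a positive temperature
    while hi - lo > tol:
        x = 0.5 * (lo + hi)
        mean_energy = (x + 2 * x * x) / (1 + x + x * x)
        if mean_energy < target:
            lo = x
        else:
            hi = x
    x = 0.5 * (lo + hi)
    z = 1 + x + x * x
    return tuple(n_total * x**i / z for i in range(3))


exact = most_probable_macrostate(70, 30)
smooth = mb_macrostate(70, 30)
print("exhaustive search:", exact)
print("MB form, Eq. (3.15):", [round(n, 2) for n in smooth])
```

Even for only 70 particles, the integer macrostate found by the search should sit within about one particle of the continuous MB populations; the agreement is expected to tighten as $N$ grows and the Stirling approximation improves.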

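The results for all the three statistics are summarized in Table 3.1.

Example 3-4. Derive the BE distribution step by step. Under which condition can it be approximated by the MB distribution?

Solution. Using the thermodynamic probability of BE statistics in Eq. (3.11), we have

$$\ln\Omega = \sum_{i=0}^{\infty} \left[ \ln(g_i + N_i - 1)! - \ln(g_i - 1)! - \ln N_i! \right]$$

$$\approx \sum_{i=0}^{\infty} \left[ (g_i + N_i - 1)\ln(g_i + N_i - 1) - (g_i + N_i - 1) - (g_i - 1)\ln(g_i - 1) + (g_i - 1) - N_i \ln N_i + N_i \right]$$

$$= \sum_{i=0}^{\infty} \left[ (g_i + N_i - 1)\ln(g_i + N_i - 1) - (g_i - 1)\ln(g_i - 1) - N_i \ln N_i \right]$$

Hence,

$$\frac{\partial \ln\Omega}{\partial N_i} \approx \ln(g_i + N_i - 1) + (g_i + N_i - 1)\frac{1}{g_i + N_i - 1} - \ln N_i - N_i\frac{1}{N_i} = \ln\left( \frac{g_i + N_i - 1}{N_i} \right) \approx \ln\left( \frac{g_i}{N_i} + 1 \right), \quad \text{since } N_i \gg 1$$

To maximize $\Omega$, we set $d(\ln\Omega) = 0$, i.e.,

$$d(\ln\Omega) = \sum_{i=0}^{\infty} \frac{\partial(\ln\Omega)}{\partial N_i}\,dN_i \approx \sum_{i=0}^{\infty} \ln\left( \frac{g_i}{N_i} + 1 \right) dN_i = 0$$

By adding Lagrangian multipliers, Eq. (3.14a) and Eq. (3.14b), we have

$$\sum_{i=0}^{\infty} \left[ \ln\left( \frac{g_i}{N_i} + 1 \right) - \alpha - \beta\varepsilon_i \right] dN_i = 0$$

Hence, $N_i = g_i/(e^{\alpha} e^{\beta\varepsilon_i} - 1)$, which is the BE distribution given in Eq. (3.16) and Table 3.1. If $\exp(\alpha + \beta\varepsilon_i) \gg 1$, Eq. (3.16) and Eq. (3.17) reduce to the MB distribution, Eq. (3.15a). Under the limiting case of $g_i \gg N_i \gg 1$, we have

$$\frac{(g_i + N_i - 1)!}{(g_i - 1)!\,N_i!} = \frac{\overbrace{(g_i + N_i - 1)\cdots(g_i + 1)\,g_i}^{N_i\ \text{terms}}}{N_i!} \approx \frac{g_i^{N_i}}{N_i!}$$

and

$$\frac{g_i!}{(g_i - N_i)!\,N_i!} = \frac{\overbrace{g_i(g_i - 1)\cdots(g_i - N_i + 1)}^{N_i\ \text{terms}}}{N_i!} \approx \frac{g_i^{N_i}}{N_i!}$$

That is to say that the thermodynamic probability for both the BE and FD statistics reduces to the MB statistics divided by $N!$, which is caused by the assumption of indistinguishable particles. Therefore,

$$\Omega_{MB,\text{corrected}} = \frac{\Omega_{MB}}{N!} = \prod_{i=0}^{\infty} \frac{g_i^{N_i}}{N_i!} \tag{3.18}$$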

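TABLE 3.1 Summary of the Three Statistics

| Statistics | Maxwell-Boltzmann (MB) | Bose-Einstein (BE) | Fermi-Dirac (FD) |
|---|---|---|---|
| Name of particles | Boltzons | Bosons | Fermions |
| Examples | Ideal gas molecules; limit of bosons and fermions | Photons and phonons | Electrons and protons |
| Distinguishability | Distinguishable | Indistinguishable | Indistinguishable |
| Degeneracy | Degenerate | Degenerate | Degenerate |
| Particles per quantum state | Unlimited | Unlimited | One |
| Thermodynamic probability $\Omega$ | $N! \prod_{i=0}^{\infty} \dfrac{g_i^{N_i}}{N_i!}$ | $\prod_{i=0}^{\infty} \dfrac{(g_i + N_i - 1)!}{(g_i - 1)!\,N_i!}$ | $\prod_{i=0}^{\infty} \dfrac{g_i!}{(g_i - N_i)!\,N_i!}$ |
| In the limit of $g_i \gg N_i$ | $\Omega_{MB}$ (given above) | $\Omega_{MB}/N!$ | $\Omega_{MB}/N!$ |
| $\ln\Omega$ | $N \ln N - N + \sum_{i=0}^{\infty} N_i[\ln(g_i/N_i) + 1]$ | $\sum_{i=0}^{\infty} [(g_i + N_i - 1)\ln(g_i + N_i - 1) - N_i \ln N_i - (g_i - 1)\ln(g_i - 1)]$ | $\sum_{i=0}^{\infty} [g_i \ln g_i - N_i \ln N_i - (g_i - N_i)\ln(g_i - N_i)]$ |
| $d(\ln\Omega) = \alpha \sum_{i=0}^{\infty} dN_i + \beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i$ | $\sum_{i=0}^{\infty} \ln\left(\dfrac{g_i}{N_i}\right) dN_i$; $\ln\left(\dfrac{g_i}{N_i}\right) - \alpha - \beta\varepsilon_i = 0$ | $\sum_{i=0}^{\infty} \ln\left(\dfrac{g_i}{N_i} + 1\right) dN_i$; $\ln\left(\dfrac{g_i}{N_i} + 1\right) - \alpha - \beta\varepsilon_i = 0$ | $\sum_{i=0}^{\infty} \ln\left(\dfrac{g_i}{N_i} - 1\right) dN_i$; $\ln\left(\dfrac{g_i}{N_i} - 1\right) - \alpha - \beta\varepsilon_i = 0$ |
| Distribution function $N_i$ | $\dfrac{g_i}{e^{\alpha} e^{\beta\varepsilon_i}}$ | $\dfrac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} - 1}$ | $\dfrac{g_i}{e^{\alpha} e^{\beta\varepsilon_i} + 1}$ |
| Applications | Ideal gases; Maxwell’s velocity distribution; limiting cases of BE and FD statistics | Planck’s law; Bose-Einstein condensation; specific heat of solids | Electron gas; Fermi level; electron specific heat in metals |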

is called the “corrected” MB statistics. For ideal molecular gases at reasonably high temperatures, $g_i \gg N_i$. For this reason, the MB distribution may be considered as the limiting case of the BE or FD distribution (see Table 3.1).
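The limiting behavior can be made tangible with a small numerical experiment. The sketch below, added here as an illustration with hypothetical function names, compares the BE and FD counts for a single level with $g_i^{N_i}/N_i!$ as the degeneracy grows, and compares the three occupancies $N_i/g_i$ as $\alpha + \beta\varepsilon_i$ grows.

```python
from math import comb, exp, factorial


def omega_ratios(g, n):
    """Return (Omega_BE, Omega_FD) for one level, each divided by g**n / n!."""
    corrected_mb = g**n / factorial(n)
    return comb(g + n - 1, n) / corrected_mb, comb(g, n) / corrected_mb


def occupancy(x, sign):
    """N_i/g_i = 1/(exp(x) + sign), x = alpha + beta*eps_i; sign = -1 (BE), 0 (MB), +1 (FD)."""
    return 1.0 / (exp(x) + sign)


# Both ratios should approach 1 as g becomes much larger than n.
for g in (10, 1000, 10**6):
    be, fd = omega_ratios(g, 5)
    print(f"g = {g:>7}: BE ratio = {be:.6f}, FD ratio = {fd:.6f}")

# The three occupancies should merge once exp(x) is much larger than 1.
for x in (0.5, 5.0, 10.0):
    print(f"x = {x:>4}: BE = {occupancy(x, -1):.6e}, "
          f"MB = {occupancy(x, 0):.6e}, FD = {occupancy(x, +1):.6e}")
```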

3.2 THERMODYNAMIC RELATIONS

The thermodynamic properties and relations can be understood from the microscopic point of view. This includes the concepts of heat and work, entropy, and the third law of thermodynamics. The partition function is key to the evaluation of thermodynamic properties.

3.2.1 Heat and Work

From Eq. (3.3), we have

$$dU = \sum_{i=0}^{\infty} \varepsilon_i\,dN_i + \sum_{i=0}^{\infty} N_i\,d\varepsilon_i \tag{3.19a}$$

The first term on the right is due to a redistribution of particles among the energy levels (which is related to a change in entropy), while the second is due to a shift in the energy levels associated with, e.g., a volume change. Consider a reversible quasi-equilibrium process for a closed system (such as a piston/cylinder arrangement). The work is associated with the volume change that does not change the entropy of the system, while heat transfer changes the entropy of the system without affecting the energy levels. Therefore,

$$\delta Q = \sum_{i=0}^{\infty} \varepsilon_i\,dN_i \quad \text{and} \quad \delta W = -\sum_{i=0}^{\infty} N_i\,d\varepsilon_i \tag{3.19b}$$

In writing the above equation, $\delta Q$ is positive for heat transferred to the system, and $\delta W$ is positive for work done by the system. They are related to macroscopic quantities for simple systems by $\delta Q = T\,dS$ and $\delta W = P\,dV$. Hence, we obtain the expression of the first law for a closed system, $dU = \delta Q - \delta W$. If the system is an open system, then $\sum_{i=0}^{\infty} \varepsilon_i\,dN_i = dU + \delta W \ne \delta Q$.

3.2.2 Entropy

The macroscopic property entropy is related to the thermodynamic probability by

$$S = k_B \ln\Omega \tag{3.20}$$

where $k_B$ is the Boltzmann constant. Consider two separate systems A and B, and their combination as a system C. At a certain time, both A and B are individually in thermodynamic equilibrium. Denote the states as A1 and B1, and the combined system as state C1. The thermodynamic probability of system C at state C1 is related to those of A1 and B1 by

$$\Omega_{C1} = \Omega_{A1} \times \Omega_{B1}$$

The entropy of C1 is then

$$S_{C1} = k_B \ln\Omega_{C1} = k_B \ln(\Omega_{A1} \times \Omega_{B1}) = k_B \ln\Omega_{A1} + k_B \ln\Omega_{B1} = S_{A1} + S_{B1}$$

Therefore, this definition of entropy meets the additive requirement. The highest entropy principle states that the entropy of an isolated system will increase until it reaches a stable-equilibrium state (thermodynamic equilibrium), i.e., $\Delta S_{\text{isolated}} \ge 0$. The microscopic understanding is that entropy is related to the probability of occurrence of a certain macrostate. For a system with specified $U$, $N$, and $V$, the macrostate that corresponds to the thermodynamic equilibrium is the most probable state and, hence, its entropy is the largest. Any other states, including those that deviate very slightly from the stable-equilibrium state, will have a much smaller thermodynamic probability. After the equilibrium state is reached, it is not possible for any macrostate, whose thermodynamic probability is much less than that of the equilibrium state, to occur within an observable amount of time.
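How sharply the thermodynamic probability peaks can be illustrated with a toy model that is not from the text: $N$ distinguishable particles shared between two groups of equal energy (for example, the two halves of a box), for which $\Omega = N!/(N_1!\,N_2!)$ and the even split is the most probable macrostate. The sketch below, with a hypothetical function name and a hypothetical particle count, evaluates $\ln\Omega$ relative to its peak for small relative deviations.

```python
from math import lgamma


def ln_omega_two_groups(n_total, n_first):
    """ln of N!/(N1! N2!): ways to split N distinguishable particles into two groups."""
    return (lgamma(n_total + 1) - lgamma(n_first + 1)
            - lgamma(n_total - n_first + 1))


N = 10**12   # hypothetical particle count, still far below that of a real gas sample
peak = ln_omega_two_groups(N, N // 2)

for rel_dev in (1e-6, 1e-5, 1e-4):
    n_first = int(N // 2 * (1 + rel_dev))
    drop = ln_omega_two_groups(N, n_first) - peak
    # exp(drop) is the probability of this macrostate relative to the even split.
    print(f"relative deviation {rel_dev:.0e}: ln(Omega/Omega_max) = {drop:.1f}")
```

For this particle count, a deviation of only one part in $10^4$ should lower $\ln\Omega$ by roughly 5000, so the macrostate is less probable than the even split by a factor of about $e^{-5000}$, even though the corresponding entropy change of about $5000\,k_B$ is negligible on a macroscopic scale.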

3.2.3 The Lagrangian Multipliers

For all three types of statistics,

$$d(\ln\Omega) = \alpha \sum_{i=0}^{\infty} dN_i + \beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i$$

where the first term is the change in the total number of particles and the second can be related to the net heat transfer for a closed system; therefore, $d(\ln\Omega) = \alpha\,dN + \beta\,\delta Q$. In a reversible process in which the total number of particles does not change (closed system), $dN = 0$, $d(\ln\Omega) = dS/k_B$, and $\delta Q = T\,dS$. Hence, we have for all three statistics

$$\beta \equiv \frac{1}{k_B T} \tag{3.21}$$

To evaluate $\alpha$, we must allow the system to change its composition. In this case,

$$d(\ln\Omega) = \alpha \sum_{i=0}^{\infty} dN_i + \beta \sum_{i=0}^{\infty} \varepsilon_i\,dN_i = \alpha\,dN + \beta(dU + P\,dV)$$

or

$$T\,dS = k_B T \alpha\,dN + dU + P\,dV$$

Substituting the above equation into the definition of the Helmholtz function, $dA = d(U - TS) = dU - T\,dS - S\,dT$, we have

$$dA = -S\,dT - P\,dV - k_B T \alpha\,dN$$

Noting that the chemical potential $\mu = (\partial A/\partial N)_{T,V} = -k_B T \alpha$, we obtain

$$\alpha = -\frac{\mu}{k_B T} \tag{3.22}$$

where is expressed in molecular quantity, and a m/RT if m is expressed in molar quantity. 3.2.4 Entropy at Absolute Zero Temperature The third law of thermodynamics states that the entropy of any pure substance vanishes at the ground state (with absolute zero temperature); see Sec. 2.1.3. For BE statistics, we have g0 g1 g2 N N0 N1 N2 c 5 a be

a be

a be

c 0 1 1 1 2 1 e e e

ELEMENTS OF STATISTICAL THERMODYNAMICS AND QUANTUM THEORY

69

At very low temperatures (T S 0), b 1/kBT S ` . Since e0 e1 e2 c, Ni gi < g eb(eie0) S 0 as T S 0 for i 1 N0 0

(3.23)

Hence, $N \approx N_0$; that is, all particles will be at the lowest energy level (ground state). If $g_0 = 1$, as is the case for a pure substance, then $\Omega = 1$ and $S = k_B \ln \Omega = 0$ as $T \to 0$; this is consistent with the third law of thermodynamics. The collapse of particles that obey BE statistics (bosons) to the ground state at sufficiently low temperatures is called Bose-Einstein condensation. Such a state of matter is called a Bose-Einstein condensate, in which quantum effects dominate the macroscopic behavior.

Some important applications of Bose-Einstein condensation are superfluidity and superconductivity. Liquid helium ($^4$He) becomes a superfluid with no viscosity at temperatures below the $\lambda$-transition ($T < 2.17$ K). The specific heat of helium at this temperature becomes infinitely large, suggesting that a phase transition occurs.

A Bose-Einstein condensate of atoms has been observed with laser cooling and trapping techniques.9 Photons from the laser collide with the atoms. The absorption can be tuned using the Doppler shift so that only atoms traveling toward the laser can absorb the photons, which reduces the momenta of these atoms. Furthermore, the excited atoms will emit photons spontaneously in all directions. The net effect is a decrease in the velocity of the atoms, resulting in a kinetic temperature down to the nanokelvin range. In the last decade, the Nobel Prize in Physics was awarded for works related to Bose-Einstein condensation four times: 1996, 1997, 2001, and 2003.

Although electrons are fermions (particles that obey FD statistics) that generally do not condense at zero temperature, they can form pairs at sufficiently low temperatures that behave like bosons. Below the critical temperature, pairs of electrons, called Cooper pairs, can travel freely without any resistance. This is the phenomenon called superconductivity, which was discovered at the beginning of the twentieth century.
A large number of elements and compounds can be made superconducting at very low temperatures. Furthermore, some oxides become superconducting at temperatures above 90 K.10 Superconductors have important applications in magnetic resonance imaging, high-speed and low-noise electronic devices, infrared sensors, and so forth. A similar phenomenon is superfluidity in the helium isotope $^3$He, which undergoes a phase transition at very low temperatures. The fermionic $^3$He atoms pair up to form bosonic entities that experience Bose-Einstein condensation at 3 mK.

For FD statistics, from Eq. (3.17), Eq. (3.21), and Eq. (3.22), we have

$$\frac{N_i}{g_i} = \frac{1}{e^{(\varepsilon_i - \mu)/k_B T} + 1} \tag{3.24}$$
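The low-temperature behavior of Eq. (3.24) can be seen by evaluating it at a few temperatures. This is a minimal sketch; the chemical potential of 5 eV and the three sample energies are hypothetical values chosen only for illustration.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fd_occupation(energy_ev: float, mu_ev: float, temperature_k: float) -> float:
    """Mean occupation N_i/g_i of Eq. (3.24) for a level at energy_ev."""
    x = (energy_ev - mu_ev) / (K_B_EV * temperature_k)
    if x > 700.0:  # exp(x) would overflow; the occupation is zero to machine precision
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


MU_EV = 5.0  # hypothetical chemical potential (assumption, not from the text)
for temperature in (300.0, 30.0, 1.0):
    row = [fd_occupation(e, MU_EV, temperature) for e in (4.9, 5.0, 5.1)]
    print(f"T = {temperature:6.1f} K: N_i/g_i at 4.9, 5.0, 5.1 eV = {row}")
```

As the temperature drops, the occupation below the chemical potential approaches 1 and the occupation above it approaches 0, while the level exactly at the chemical potential stays half filled.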

As $T \to 0$, it is found that $N_i/g_i = 1$ for all energy levels with $\varepsilon_i < \mu$ and $N_i/g_i = 0$ for energy levels with $\varepsilon_i > \mu$. That is, all quantum states are filled for $i = 0, 1, 2, \ldots, j$ (with $\varepsilon_j < \mu$), and all quantum states are empty for $i = j+1, j+2, \ldots$ (with $\varepsilon_{j+1} > \mu$), as schematically shown in Fig. 3.6. More discussions will be given in Chap. 5 on the behavior of free electrons. For now, it is sufficient to say that the thermodynamic probability $\Omega = 1$ for FD statistics at absolute zero temperature. Therefore, the entropy $S = 0$ at $T \to 0$ K for both the BE and FD statistics. However, MB statistics does not satisfy the third law and is not applicable to very low temperatures.

FIGURE 3.6 Schematic of the Fermi-Dirac distribution at 0 K.

3.2.5 Macroscopic Properties in Terms of the Partition Function

The partition function is an important quantity in statistical thermodynamics. Unlike the characteristic functions (such as the Helmholtz free energy and the Gibbs free energy defined in Chap. 2) used in macroscopic thermodynamics, the physical meaning of the partition function is not immediately clear. However, the introduction of the partition function allows the calculation of macroscopic thermodynamic properties from the microscopic representation. There are different types of partition functions. For MB statistics, the partition function is defined as

$$Z = N e^{\alpha} = \sum_{i=0}^{\infty} g_i e^{-\varepsilon_i/k_B T} \tag{3.25}$$

Therefore,

$$N_i = \frac{N}{Z}\, g_i e^{-\varepsilon_i/k_B T} \tag{3.26}$$

Since

$$\left[\frac{\partial(\ln Z)}{\partial T}\right]_{V,N} = \frac{1}{Z}\left(\frac{\partial Z}{\partial T}\right)_{V,N} = \frac{1}{Z}\sum_{i=0}^{\infty} g_i e^{-\varepsilon_i/k_B T}\,\frac{\varepsilon_i}{k_B T^2} = \frac{U e^{\alpha}/(k_B T^2)}{N e^{\alpha}} = \frac{U}{N k_B T^2}$$

we have

$$U = N k_B T^2 \left[\frac{\partial(\ln Z)}{\partial T}\right]_{V,N} \tag{3.27}$$

Using the corrected MB statistics given in Eq. (3.18), we can express the entropy as

$$S = k_B \ln(\Omega_{MB}/N!) = k_B \sum_{i=0}^{\infty} N_i \left(1 + \ln\frac{g_i}{N_i}\right) = k_B \sum_{i=0}^{\infty} N_i \left(1 + \ln\frac{Z}{N} + \beta\varepsilon_i\right) = N k_B + N k_B \ln\frac{Z}{N} + k_B \beta U \tag{3.28a}$$

Had we not divided $\Omega_{MB}$ by $N!$, we would get $S = N k_B \ln Z + k_B \beta U$, which is different from Eq. (3.28a) by a constant. After substituting $\beta$ and $U$ into Eq. (3.28a), we obtain

$$S = N k_B \left\{ 1 + \ln\frac{Z}{N} + T\left[\frac{\partial(\ln Z)}{\partial T}\right]_{V,N} \right\} \tag{3.28b}$$

The Helmholtz free energy is

$$A = U - TS = -N k_B T \left(1 + \ln\frac{Z}{N}\right) \tag{3.29}$$

The pressure is

$$P = -\left(\frac{\partial A}{\partial V}\right)_{T,N} = N k_B T \left[\frac{\partial(\ln Z)}{\partial V}\right]_{T,N} \tag{3.30}$$

The enthalpy H and the Gibbs free energy G can also be obtained. The partition function is now related to the macroscopic thermodynamic properties of interest for simple substances.
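As a consistency check on Eq. (3.26) and Eq. (3.27), the internal energy can be computed two ways: directly as the sum of $N_i \varepsilon_i$, and from the temperature derivative of $\ln Z$. The sketch below assumes a hypothetical three-level system (the degeneracies and energies are made up for illustration) and uses a central finite difference for the derivative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

# Hypothetical energy levels as (degeneracy g_i, energy eps_i in J); not from the text.
LEVELS = [(1, 0.0), (3, 2.0e-21), (5, 6.0e-21)]


def partition_function(temperature: float) -> float:
    """Z = sum_i g_i exp(-eps_i / k_B T), Eq. (3.25)."""
    return sum(g * math.exp(-eps / (K_B * temperature)) for g, eps in LEVELS)


def internal_energy_direct(n_particles: float, temperature: float) -> float:
    """U = sum_i N_i eps_i with N_i taken from Eq. (3.26)."""
    z = partition_function(temperature)
    return sum(
        n_particles / z * g * math.exp(-eps / (K_B * temperature)) * eps
        for g, eps in LEVELS
    )


def internal_energy_from_z(n_particles: float, temperature: float, dt: float = 1e-3) -> float:
    """U = N k_B T^2 d(ln Z)/dT, Eq. (3.27), with a central finite difference."""
    dlnz_dt = (
        math.log(partition_function(temperature + dt))
        - math.log(partition_function(temperature - dt))
    ) / (2.0 * dt)
    return n_particles * K_B * temperature**2 * dlnz_dt


u_direct = internal_energy_direct(1.0e20, 300.0)
u_from_z = internal_energy_from_z(1.0e20, 300.0)
print(f"direct sum: {u_direct:.6e} J, from ln Z: {u_from_z:.6e} J")
```

The two routes should agree to within the finite-difference error, which illustrates that the partition function alone carries the information needed for U.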

3.3 IDEAL MOLECULAR GASES

An important application of statistical mechanics is to model and predict the thermal properties of materials. In this section, the application of MB statistics to obtain the equation of state and the velocity distributions for ideal molecular gases is presented. The microscopic theories of the specific heat for ideal monatomic and polyatomic gases are given subsequently.

3.3.1 Monatomic Ideal Gases

For a monatomic ideal gas at moderate temperatures, MB statistics can be applied, and the translational energies are

$$\varepsilon = \frac{1}{2} m (v_x^2 + v_y^2 + v_z^2) = \frac{1}{2} m v^2 \tag{3.31}$$

Consider a volume element in the phase space, $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$, where $\mathbf{p} = m\mathbf{v}$ is the momentum of a molecule. The accuracy of specifying the momentum and the displacement is limited by $\Delta x\, \Delta p_x \sim h$, given by the uncertainty principle. The degeneracy, which is the number of quantum states (boxes of size $h^3$) in a volume element of the phase space, is given by

$$dg = \frac{dx\,dy\,dz\,dp_x\,dp_y\,dp_z}{h^3} = \frac{m^3}{h^3}\, dx\,dy\,dz\,dv_x\,dv_y\,dv_z \tag{3.32}$$

Many useful results were obtained before quantum mechanics by assuming that $h^3$ is some constant. A more rigorous proof of Eq. (3.32) will be given in Sec. 3.5. When the energy levels are sufficiently closely spaced, the partition function can be expressed in terms of an integral as $Z_t = \int e^{-\varepsilon/k_B T}\, dg$ or

$$Z_t = \iiint dx\,dy\,dz \iiint \frac{m^3}{h^3} \exp\left[-\frac{m}{2 k_B T}(v_x^2 + v_y^2 + v_z^2)\right] dv_x\,dv_y\,dv_z \tag{3.33}$$

The space integration yields the volume V, and the velocity integration can be individually performed, i.e.,

$$\int_{-\infty}^{\infty} \exp\left(-\frac{m v_x^2}{2 k_B T}\right) dv_x = \sqrt{\frac{2\pi k_B T}{m}} \tag{3.34}$$

Hence,

$$Z_t = V \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2} \tag{3.35}$$

Therefore,

$$e^{\alpha} = \frac{V}{N} \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2} \tag{3.36}$$

which is indeed much greater than unity at normal temperatures for most substances, suggesting that the MB statistics is applicable for ideal molecular gases. At extremely low temperatures, intermolecular forces cannot be neglected and the molecules are not independent anymore. From Eq. (3.30), we have $P = N k_B T\, [\partial(\ln Z)/\partial V]_{T,N} = N k_B T / V$; i.e.,

$$PV = N k_B T \quad \text{or} \quad P = n k_B T \tag{3.37}$$

where $n = N/V$ is the number density. The Boltzmann constant is the ideal (universal) gas constant on the molecular basis, i.e., $k_B = R/N_A$. The internal energy, the specific heats, and the absolute entropy can also be evaluated.

$$U = N k_B T^2 \left[\frac{\partial(\ln Z)}{\partial T}\right]_{V,N} = \frac{3}{2} N k_B T \tag{3.38}$$

which is not a function of pressure. The molar specific internal energy is $u = \frac{3}{2} R T$, and the molar specific heats are

$$c_v = \left(\frac{\partial u}{\partial T}\right)_V = \frac{3}{2} R \tag{3.39}$$

and

$$c_p = \left(\frac{\partial h}{\partial T}\right)_P = \frac{5}{2} R \tag{3.40}$$

The above equations show that the specific heats of monatomic gases are independent of temperature, except at very high temperatures when electronic contributions become important. The molar specific heats do not depend on the type of molecules, but the same is not true for mass specific heats. Using Eq. (3.28b), the absolute entropy can be expressed as

$$S = N k_B \left\{ \ln\left[\frac{V}{N}\left(\frac{2\pi m k_B T}{h^2}\right)^{3/2}\right] + \frac{5}{2} \right\}$$

Therefore, the molar specific entropy is a function of T and P, i.e.,

$$s(T,P) = R \left\{ \ln\left[\frac{k_B T}{P}\left(\frac{2\pi m k_B T}{h^2}\right)^{3/2}\right] + \frac{5}{2} \right\} \tag{3.41}$$

This is the Sackur-Tetrode equation.
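To give a sense of the magnitudes involved, the sketch below evaluates Eq. (3.36) and Eq. (3.41) for helium. The state point (298.15 K and 100 kPa) and the helium molar mass are assumptions chosen for illustration; they are not taken from the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
N_A = 6.02214076e23     # Avogadro constant, 1/mol
R = K_B * N_A           # universal gas constant, J/(mol K)


def exp_alpha(mass_kg: float, temperature: float, pressure: float) -> float:
    """e^alpha from Eq. (3.36), with V/N replaced by k_B T / P for an ideal gas."""
    return (K_B * temperature / pressure) * (
        2.0 * math.pi * mass_kg * K_B * temperature / H**2
    ) ** 1.5


def sackur_tetrode_molar_entropy(mass_kg: float, temperature: float, pressure: float) -> float:
    """Molar entropy s(T, P) in J/(mol K) from Eq. (3.41)."""
    return R * (math.log(exp_alpha(mass_kg, temperature, pressure)) + 2.5)


M_HE = 4.002602e-3 / N_A  # mass of one helium atom, kg (assumed molar mass 4.002602 g/mol)

ea = exp_alpha(M_HE, 298.15, 1.0e5)
s_he = sackur_tetrode_molar_entropy(M_HE, 298.15, 1.0e5)
print(f"e^alpha = {ea:.3e}")
print(f"s = {s_he:.2f} J/(mol K)")
```

For these conditions $e^{\alpha}$ is of order $10^5$, which supports the statement that it is much greater than unity, and the molar entropy comes out near 126 J/(mol K).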

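3.3.2 Maxwell's Velocity Distribution

Rewrite $N_i = g_i e^{-\alpha} e^{-\varepsilon_i/k_B T}$ as $dN = dg\, e^{-\alpha} e^{-\varepsilon/k_B T}$. In a volume V and from $\mathbf{v}$ to $\mathbf{v} + d\mathbf{v}$ (i.e., $v_x$ to $v_x + dv_x$, $v_y$ to $v_y + dv_y$, and $v_z$ to $v_z + dv_z$), the number of molecules dN per unit volume may be expressed as

$$\frac{dN}{V} = \frac{N}{V}\left(\frac{h^2}{2\pi m k_B T}\right)^{3/2} \frac{m^3}{h^3} \exp\left(-\frac{m v^2}{2 k_B T}\right) dv_x\,dv_y\,dv_z \tag{3.42}$$

or

$$f(\mathbf{v})\,d\mathbf{v} = \frac{dN}{V} = n \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right) d\mathbf{v} \tag{3.43}$$

where $f(\mathbf{v})$ is the Maxwell velocity distribution in a unit volume. Notice that

$$F(\mathbf{v}) = \frac{f(\mathbf{v})}{n} = \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right) \tag{3.44}$$

which is a Gaussian distribution. Notice that $v^2 = \mathbf{v}\cdot\mathbf{v} = v_x^2 + v_y^2 + v_z^2$. The distribution of each velocity component is also Gaussian, such that

$$F(\mathbf{v}) = F(v_x) F(v_y) F(v_z) \tag{3.45}$$

Taking the x component as an example, we can write

$$F(v_x) = \left(\frac{m}{2\pi k_B T}\right)^{1/2} \exp\left(-\frac{m v_x^2}{2 k_B T}\right) \tag{3.46}$$

The speed distribution may be obtained by integrating the velocity distribution in a spherical shell (i.e., over the solid angle of $4\pi$):

$$F(v)\,dv = \iint_{4\pi} F(\mathbf{v})\,d\mathbf{v} = \iint_{4\pi} \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right) v^2\, d\Omega\, dv$$

Therefore,

$$F(v) = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} v^2 \exp\left(-\frac{m v^2}{2 k_B T}\right) \tag{3.47}$$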

Figure 3.7 plots the speed distribution of He gas at 0, 300, and 800°C. When evaluating $k_B T$, we must convert T to absolute temperature. It can be seen that more molecules will be at higher speeds as the temperature increases. It should be noted that $F(v = 0) = 0$ but $F(\mathbf{v})$ is maximum at $\mathbf{v} = 0$. In the speed coordinate, an interval between $v$ and $v + dv$ corresponds to a spherical shell in the velocity space. Even though $F(\mathbf{v})$ is maximum at $\mathbf{v} = 0$, the probability of finding a molecule per unit speed interval decreases to 0 as $v \to 0$, which is caused by the associated decrease in the volume of the spherical shell.

FIGURE 3.7 Speed distribution for helium gas at different temperatures.

Example 3-5. Find the average speed and the root-mean-square speed for a He gas at 200°C at 100 kPa. What if the pressure is changed to 200 kPa? What are the most probable velocity and the most probable speed?

Solution. The average speed may be obtained from either the velocity distribution or the speed distribution. That is

$$\bar{v} = \iiint v\, F(\mathbf{v})\, d\mathbf{v} = \int_0^{\infty} v\, F(v)\, dv = \sqrt{\frac{8 k_B T}{\pi m}} \tag{3.48}$$

The average of $v^2$ is (see Appendix B.5)

$$\overline{v^2} = \iiint v^2 F(\mathbf{v})\, d\mathbf{v} = \int_0^{\infty} v^2 F(v)\, dv = \frac{3 k_B T}{m} \tag{3.49a}$$

Therefore the root-mean-square speed is

$$v_{\text{rms}} = \sqrt{\overline{v^2}} = \sqrt{\frac{3 k_B T}{m}} \tag{3.49b}$$

Plugging in the numerical values, we have $\bar{v} = 1582$ m/s and $v_{\text{rms}} = 1717$ m/s for He gas at 200°C. We also notice that the pressure has no effect on the speed distribution, unless it is so high that intermolecular forces cannot be neglected. The most probable velocity $\mathbf{v}_{\text{mp}} = 0$ because of the symmetry in the Gaussian distribution. We can obtain the most probable speed by setting $F'(v) = 0$, i.e.,

$$2v \exp\left(-\frac{m v^2}{2 k_B T}\right) - v^2 \left(\frac{m v}{k_B T}\right) \exp\left(-\frac{m v^2}{2 k_B T}\right) = 0$$

The solution gives the most probable speed as $v_{\text{mp}} = \sqrt{2 k_B T/m}$. For He gas at 200°C, it gives $v_{\text{mp}} = 1402$ m/s. Note that $v_{\text{mp}} : \bar{v} : v_{\text{rms}} = \sqrt{2} : \sqrt{8/\pi} : \sqrt{3} \approx 1.4 : 1.6 : 1.7$.
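The numbers in Example 3-5 can be reproduced with a few lines of code. This is a minimal sketch; the helium molar mass of 4.002602 g/mol is an assumed input that the text does not state explicitly.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro constant, 1/mol


def mean_speed(mass_kg: float, temperature: float) -> float:
    """Average speed from Eq. (3.48)."""
    return math.sqrt(8.0 * K_B * temperature / (math.pi * mass_kg))


def rms_speed(mass_kg: float, temperature: float) -> float:
    """Root-mean-square speed from Eq. (3.49b)."""
    return math.sqrt(3.0 * K_B * temperature / mass_kg)


def most_probable_speed(mass_kg: float, temperature: float) -> float:
    """Speed at which the speed distribution of Eq. (3.47) peaks."""
    return math.sqrt(2.0 * K_B * temperature / mass_kg)


M_HE = 4.002602e-3 / N_A   # mass of one helium atom, kg (assumed molar mass)
T_K = 200.0 + 273.15       # 200 degrees C converted to absolute temperature

v_mean = mean_speed(M_HE, T_K)
v_rms = rms_speed(M_HE, T_K)
v_mp = most_probable_speed(M_HE, T_K)
print(f"mean = {v_mean:.0f} m/s, rms = {v_rms:.0f} m/s, most probable = {v_mp:.0f} m/s")
```

Pressure does not appear in any of the three functions, which mirrors the observation that the speed distribution of an ideal gas depends only on temperature and molecular mass.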

Comment. An important consequence of Eq. (3.49a) is that temperature is related to the mean kinetic energy of the molecule, i.e.,

$$\frac{1}{2} m \overline{v_x^2} = \frac{1}{2} m \overline{v_y^2} = \frac{1}{2} m \overline{v_z^2} = \frac{1}{2} k_B T \tag{3.50}$$

The internal energy of a monatomic gas given in Eq. (3.38) is the sum of the kinetic energies of all molecules.

3.3.3 Diatomic and Polyatomic Ideal Gases

Additional degrees of freedom or energy storage modes must be considered for diatomic and polyatomic molecules, besides translation. The molecule may rotate about its center of gravity, and atoms may vibrate with respect to each other. For a molecule consisting of q atoms, each atom may move in all three directions, and there will be a total of 3q modes. Consider the translation of the molecule as a whole; there are three translational degrees of freedom or modes, i.e., $f_t = 3$. For diatomic molecules or polyatomic molecules whose atoms are arranged in a line (such as CO2), as shown in Fig. 3.8, there are two rotational degrees of freedom or modes, i.e., $f_r = 2$. Therefore, there are $f_v = 3q - 5$ vibrational modes, each consisting of two degrees of freedom corresponding to the kinetic energy and the potential energy. For polyatomic molecules whose atoms are not aligned (such as H2O and CH4, see Fig. 3.9), there are three rotational degrees of freedom, i.e., $f_r = 3$. The vibrational modes are thus $f_v = 3q - 6$.

FIGURE 3.8 (a) A diatomic molecule, showing two rotational and one vibrational degrees of freedom. (b) CO2 molecule, where the atoms are aligned.

FIGURE 3.9 (a) A H2O molecule in which the atoms are not aligned. (b) The tetrahedral methane (CH4) molecule.

The total energy of a molecule may be expressed as the sum of translational, rotational, and vibrational energies, i.e., $\varepsilon = \varepsilon_t + \varepsilon_r + \varepsilon_v$. For simplicity, we have neglected contributions from the electronic ground state and chemical dissociation, which can be included as additional terms in evaluating the internal energy and the entropy.1 At high temperatures, the vibration mode can be coupled with the rotation mode. Here, however, it is assumed that these modes are independent. The partition function can be written as

$$Z = Z_t Z_r Z_v = \left(\sum g_t e^{-\varepsilon_t/k_B T}\right)\left(\sum g_r e^{-\varepsilon_r/k_B T}\right)\left(\sum g_v e^{-\varepsilon_v/k_B T}\right) \tag{3.51}$$

For polyatomic molecules, Eq. (3.31) through Eq. (3.36) hold for the translational modes. $Z_r$ and $Z_v$ are internal contributions that do not depend on volume; therefore, Eq. (3.37) also holds. Since the degrees of freedom are independent of each other, Maxwell's velocity and speed distributions discussed in Sec. 3.3.2 still hold for polyatomic gases. The problem now is to determine the rotational and vibrational energy levels and degeneracies. Generally speaking, there exists a certain characteristic temperature associated with each degree of freedom. The characteristic temperature for translation is very low for molecular gases. On the other hand, the characteristic temperature for rotation is slightly higher, and that for vibration is usually very high, as can be seen from Table 3.2 for selected diatomic molecules. If the temperature is much less than the characteristic temperature of a certain mode, then the contribution of that mode to the energy storage is negligible. For temperatures much higher than the characteristic temperature, however, there often exist some asymptotic approximations.

Rotation. A quantum mechanical analysis of a rigid rod, to be derived in Sec. 3.5.3, shows that the rotational energy levels are given by

$$\frac{\varepsilon_l}{k_B T} = l(l+1)\frac{\Theta_r}{T} \tag{3.52}$$

TABLE 3.2 Characteristic Temperatures of Rotation and Vibration for Some Diatomic Molecules

Substance            Symbol   Θr (K)   Θv (K)
Hydrogen             H2       87.5     6320
Deuterium            D2       43.8     4490
Hydrogen chloride    HCl      15.2     4330
Nitrogen             N2       2.86     3390
Carbon monoxide      CO       2.78     3120
Nitric oxide         NO       2.45     2745
Oxygen               O2       2.08     2278
Chlorine             Cl2      0.35     814
Sodium vapor         Na2      0.08     140

Here, $\Theta_r$ is the characteristic temperature for rotation and is given by $\Theta_r = h^2/(8\pi^2 k_B I)$, where I is the moment of inertia of the molecule about the center of mass. The larger the value of I, the smaller the characteristic temperature will be. This is clearly shown in Table 3.2. The degeneracy of rotational energy levels is

$$g_l = \frac{2l + 1}{\sigma} \tag{3.53}$$

where $\sigma$ is a symmetry number that arises from molecular symmetry: $\sigma = 1$ if the atoms are of different types (such as in a NO or CO molecule), and $\sigma = 2$ if the atoms are the same (such as in an O2 or N2 molecule). The rotational partition function becomes

$$Z_r = \sum_{l=0}^{\infty} \frac{2l + 1}{\sigma} \exp\left[-l(l+1)\frac{\Theta_r}{T}\right] \tag{3.54}$$

This series converges very fast for $\Theta_r/T > 0.5$, since

$$Z_r = \frac{1}{\sigma}\left[1 + 3\exp\left(-\frac{2\Theta_r}{T}\right) + 5\exp\left(-\frac{6\Theta_r}{T}\right) + 7\exp\left(-\frac{12\Theta_r}{T}\right) + \cdots\right]$$

For $T/\Theta_r > 1$, Eq. (3.54) may be expanded to give (see Problem 3.26)

$$Z_r = \frac{T}{\sigma\Theta_r}\left[1 + \frac{1}{3}\left(\frac{\Theta_r}{T}\right) + \frac{1}{15}\left(\frac{\Theta_r}{T}\right)^2 + \frac{4}{315}\left(\frac{\Theta_r}{T}\right)^3 + \cdots\right] \tag{3.55}$$

At temperatures much higher than the characteristic temperature of rotation, $T/\Theta_r \gg 1$, the above equation reduces to

$$Z_r \approx \frac{T}{\sigma\Theta_r} \tag{3.56}$$

Under this limit, the contribution of the rotational energy to the internal energy becomes

$$U_r \approx N k_B T \tag{3.57}$$

The contribution to the molar specific heat by the two rotational degrees of freedom is

$$c_{v,r} = R \tag{3.58}$$
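The relation between the series of Eq. (3.54), the expansion of Eq. (3.55), and the high-temperature limit of Eq. (3.56) can be examined numerically. The sketch below uses the nitrogen values from Table 3.2 ($\Theta_r = 2.86$ K, $\sigma = 2$); the truncation limit `l_max` is an assumption that is adequate only when $\Theta_r/T$ is not extremely small.

```python
import math


def z_rot_series(theta_r: float, temperature: float, sigma: int, l_max: int = 500) -> float:
    """Direct summation of Eq. (3.54), truncated at l_max (an assumed cutoff)."""
    x = theta_r / temperature
    return sum((2 * l + 1) * math.exp(-l * (l + 1) * x) for l in range(l_max + 1)) / sigma


def z_rot_expansion(theta_r: float, temperature: float, sigma: int) -> float:
    """Four-term expansion of Eq. (3.55)."""
    x = theta_r / temperature
    return (1.0 + x / 3.0 + x**2 / 15.0 + 4.0 * x**3 / 315.0) / (sigma * x)


def z_rot_high_t(theta_r: float, temperature: float, sigma: int) -> float:
    """High-temperature limit of Eq. (3.56)."""
    return temperature / (sigma * theta_r)


THETA_R_N2, SIGMA_N2 = 2.86, 2  # nitrogen, from Table 3.2

z_series = z_rot_series(THETA_R_N2, 300.0, SIGMA_N2)
z_expand = z_rot_expansion(THETA_R_N2, 300.0, SIGMA_N2)
z_limit = z_rot_high_t(THETA_R_N2, 300.0, SIGMA_N2)
print(f"series = {z_series:.4f}, expansion = {z_expand:.4f}, limit = {z_limit:.4f}")
```

At room temperature the expansion is practically indistinguishable from the full series for nitrogen, and even the one-term limit is off by well under one percent.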

Vibration. The vibration in a molecule can be treated as a harmonic oscillator. For each vibration mode, the quantized energy levels are given in Sec. 3.5.5 as

$$\varepsilon_{v,i} = \left(i + \frac{1}{2}\right) h\nu, \quad i = 0, 1, 2, \ldots \tag{3.59}$$

where $\nu$ is the natural frequency of vibration, and the ground-state energy is $\frac{1}{2}h\nu$. The vibrational energy levels are not degenerate, i.e., $g_{v,i} = 1$. Therefore, we can write

$$Z_v = \sum_{i=0}^{\infty} e^{-(i + 1/2)h\nu/k_B T} = e^{-\Theta_v/2T} \sum_{i=0}^{\infty} e^{-i\Theta_v/T}$$

where $\Theta_v = h\nu/k_B$ is a characteristic temperature for vibration and is listed in Table 3.2 for several diatomic molecules. The vibrational partition function becomes

$$Z_v = \frac{e^{-\Theta_v/2T}}{1 - e^{-\Theta_v/T}} = \frac{e^{\Theta_v/2T}}{e^{\Theta_v/T} - 1} \tag{3.60}$$

Its contribution to the internal energy and the specific heat can be written as

$$U_v = N k_B \Theta_v \left(\frac{1}{2} + \frac{1}{e^{\Theta_v/T} - 1}\right) \tag{3.61}$$

and

$$c_{v,v} = R\, \frac{\Theta_v^2}{T^2}\, \frac{e^{\Theta_v/T}}{(e^{\Theta_v/T} - 1)^2} \tag{3.62}$$
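Eq. (3.62) is straightforward to evaluate. The sketch below computes $c_{v,v}/R$ for nitrogen ($\Theta_v = 3390$ K from Table 3.2) at a few temperatures. It rewrites the expression in terms of $e^{-\Theta_v/T}$, which is algebraically identical but avoids overflow at low temperatures; that rewriting is an implementation choice and not part of the text.

```python
import math


def vibrational_cv_over_r(theta_v: float, temperature: float) -> float:
    """c_{v,v}/R from Eq. (3.62), written as x^2 e^{-x} / (1 - e^{-x})^2 with x = theta_v/T."""
    x = theta_v / temperature
    return x**2 * math.exp(-x) / math.expm1(-x) ** 2


THETA_V_N2 = 3390.0  # nitrogen, from Table 3.2

for temperature in (300.0, 1000.0, 3390.0, 1.5 * 3390.0):
    ratio = vibrational_cv_over_r(THETA_V_N2, temperature)
    print(f"T = {temperature:7.1f} K: c_vv / R = {ratio:.4f}")
```

Near room temperature the vibrational contribution for nitrogen is a fraction of a percent of R, whereas at $T = 1.5\,\Theta_v$ it has already reached about 96% of its classical value.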

At $T \ll \Theta_v$, the vibrational mode contributes to the internal energy but not to the specific heat. At $T > 1.5\,\Theta_v$, $U_v$ almost linearly depends on T and $c_{v,v} \approx R$. In classical statistical mechanics, it is believed that each degree of freedom contributes $\frac{1}{2} k_B T$ to the stored thermal energy and $\frac{1}{2} k_B$ to the specific heat on a per-particle basis. This is called the equipartition principle. The contribution of each vibrational mode is R, not R/2, because each vibrational mode includes a kinetic component and a potential component for energy storage and is generally considered as two degrees of freedom. It should be noted that the equipartition principle is only applicable at sufficiently high temperatures and for particles that obey MB statistics or, in some limiting cases, BE statistics. Because energy is additive, as is the specific heat, we can write

$$c_v = c_{v,t} + c_{v,r} + c_{v,v} \tag{3.63}$$

The result is schematically shown in Fig. 3.10. One can see that for a diatomic ideal gas, $c_v \approx 2.5R$ if

$$\Theta_r \ll T \ll \Theta_v \tag{3.64}$$

which happens to be near room temperature for many gases such as nitrogen and carbon monoxide; see Table 3.2.

FIGURE 3.10 Typical specific heat curve of a diatomic ideal gas.

Figure 3.11 plots the specific heat for several real gases at sufficiently low pressure so that the ideal gas model is applicable. It should be noted that, for hydrogen, nuclear spin is important and Eq. (3.54) needs to be modified to account for the spin degeneracy.1,2 However, Eq. (3.57) and Eq. (3.58) predict the right trend and are applicable at temperatures much higher than $\Theta_r$. At extremely high temperatures (say 3000 K), electronic contributions and the coupling between rotation and vibration become important. Although Eq. (3.63) is the correct expression for the specific heat at moderate temperatures, two additional partition functions must be included to correctly evaluate the internal energy and the entropy (see Problem 3.22). We limit the derivations to the specific heat, which is closely related to heat transfer calculations.

FIGURE 3.11 Specific heat at constant volume for several ideal gases.

The characteristic temperature for rotation is usually very small for polyatomic molecules because of their large moments of inertia. Therefore, the rotational degrees of freedom can be assumed as fully excited in almost any practical situation. Each rotational degree of freedom will contribute R/2 to the molar specific heat. For molecules whose atoms are aligned (such as CO2), the rotational contribution to the specific heat is R, and

$$c_v = \frac{5}{2}R + R \sum_{i=1}^{3q-5} \frac{\zeta_i^2 e^{\zeta_i}}{(e^{\zeta_i} - 1)^2}, \quad \zeta_i = \Theta_{v,i}/T \tag{3.65}$$

If $T \gg \Theta_{v,i}$, then $c_v \to R(3q - 2.5)$. For molecules such as H2O and CH4 whose atoms are not aligned, we have

$$c_v = 3R + R \sum_{i=1}^{3q-6} \frac{\zeta_i^2 e^{\zeta_i}}{(e^{\zeta_i} - 1)^2} \tag{3.66}$$

TABLE 3.3 Vibrational Modes of Several Gases, Where the Integer in the Parentheses Indicates the Number of Degenerate Modes

Type    cm⁻¹        cm⁻¹        cm⁻¹    cm⁻¹        Total fv
CO2     667 (2)     1343        2349    –           4
H2O     1595        3657        3756    –           3
CH4     1306 (3)    1534 (2)    2916    3019 (3)    9

In this case, $c_v \to R(3q - 3)$ at $T \gg \Theta_{v,i}$. Again, electronic contribution may be significant at very high temperatures. Table 3.3 lists the vibrational frequencies for several commonly encountered gases. The unit of frequency is given in inverse centimeters (cm⁻¹), which is often used in spectroscopic analyses. Note that $\Theta_v = h\nu/k_B = h c_0 \bar{\nu}/k_B$, where $\bar{\nu}$ is the wavenumber in cm⁻¹ if we take $c_0 = 3 \times 10^{10}$ cm/s. That is, $\Theta_v\,(\text{K}) = 1.44\,\bar{\nu}\,(\text{cm}^{-1})$. One can use this table to estimate the specific heat of these gases based on Eq. (3.65) or Eq. (3.66).

In reality, vibration-rotation interactions result in multiple absorption lines around each vibration mode, which can be observed through infrared absorption spectroscopy. Figure 3.12 shows the molecular absorption spectra of CO2 and H2O measured with a Fourier-transform infrared spectrometer. The absorption spectra were obtained by comparing the spectrum when the measurement chamber is open with that when the chamber is purged with nitrogen gas, which does not absorb in the mid-infrared region. The concentrations of H2O and CO2 in the experiments were not controlled since the purpose is to demonstrate the infrared absorption frequencies only. While the resolution of 1 cm⁻¹ is not high enough to resolve very fine features, the absorption bands near 670 cm⁻¹ due to degenerate bending modes and near 2350 cm⁻¹ due to the asymmetric stretching mode in CO2 can be clearly seen. Note that the symmetric vibration mode of CO2 at 1343 cm⁻¹ is infrared inactive, i.e., it does not show up in the absorption spectrum but can be observed with Raman spectroscopy. Furthermore, the vibration-rotation interactions cause multiple lines in the water vapor absorption bands from 1300 to 2000 cm⁻¹ and from 3500 to 4000 cm⁻¹.

FIGURE 3.12 Infrared absorption spectrum of ambient air obtained with a Fourier-transform infrared spectrometer.
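As an illustration of how Table 3.3 can be used with Eq. (3.65), the sketch below estimates the molar specific heat of CO2. It uses the conversion $\Theta_v = 1.44\,\bar{\nu}$ quoted above and ignores electronic contributions and vibration-rotation coupling, so it should be read as a rough estimate under the independent-mode assumption.

```python
import math

# (wavenumber in cm^-1, number of degenerate modes) for CO2, from Table 3.3
CO2_MODES = [(667.0, 2), (1343.0, 1), (2349.0, 1)]

KELVIN_PER_WAVENUMBER = 1.44  # Theta_v (K) = 1.44 * wavenumber (cm^-1), as quoted in the text


def mode_cv_over_r(zeta: float) -> float:
    """Contribution of one vibrational mode, zeta^2 e^zeta / (e^zeta - 1)^2, in overflow-safe form."""
    return zeta**2 * math.exp(-zeta) / math.expm1(-zeta) ** 2


def linear_molecule_cv_over_r(modes, temperature: float) -> float:
    """c_v / R for a linear molecule from Eq. (3.65): 5/2 plus the vibrational sum."""
    total = 2.5
    for wavenumber, degeneracy in modes:
        theta_v = KELVIN_PER_WAVENUMBER * wavenumber
        total += degeneracy * mode_cv_over_r(theta_v / temperature)
    return total


for temperature in (300.0, 600.0, 1200.0):
    print(f"T = {temperature:6.1f} K: c_v / R = {linear_molecule_cv_over_r(CO2_MODES, temperature):.3f}")
```

At 300 K the estimate is roughly 3.5R, with most of the vibrational part coming from the doubly degenerate bending mode at 667 cm⁻¹, and at very high temperatures the result approaches $R(3q - 2.5) = 6.5R$.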

Example 3-6. How many rotational degrees of freedom are there in a silane (SiH4) molecule? If a low-pressure silane gas is raised to a temperature high enough to completely excite its rotational and vibrational modes, find its specific heats.

Solution. For SiH4, there will be three translational degrees of freedom, i.e., $f_t = 3$, three rotational degrees of freedom, i.e., $f_r = 3$, and $f_v = 3q - 6 = 9$ vibrational modes. If all the modes are excited, the specific heat at constant volume will be $c_v = 1.5R + 1.5R + 9R = 12R$. Given that $M = 32$ kg/kmol, we find $c_v = 3.12$ kJ/(kg K), $c_p = 3.38$ kJ/(kg K), and $\gamma = 13R/12R = 1.083$. The actual specific heats would be much smaller at moderate temperatures.
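The mode counting in Example 3-6 generalizes to any ideal-gas molecule in the fully excited limit. The helper below is a hypothetical illustration (the function name and interface are not from the text); it counts R/2 per translational and rotational degree of freedom and R per vibrational mode, and ignores electronic contributions.

```python
R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol K)


def fully_excited_cv_over_r(num_atoms: int, linear: bool) -> float:
    """c_v / R when translation, rotation, and all vibrational modes are fully excited."""
    translational = 1.5                                          # three degrees of freedom, R/2 each
    rotational = 1.0 if linear else 1.5                          # two or three degrees of freedom
    vibrational = float(3 * num_atoms - (5 if linear else 6))    # R per vibrational mode
    return translational + rotational + vibrational


# Silane: q = 5 atoms, not aligned; molar mass taken as 32 g/mol as in the example.
cv_over_r = fully_excited_cv_over_r(5, linear=False)
molar_mass = 32.0                                         # g/mol
cv_mass = cv_over_r * R_UNIVERSAL / molar_mass            # J/(g K), numerically equal to kJ/(kg K)
cp_mass = (cv_over_r + 1.0) * R_UNIVERSAL / molar_mass    # c_p = c_v + R for an ideal gas
gamma = (cv_over_r + 1.0) / cv_over_r

print(f"c_v = {cv_over_r:.1f} R = {cv_mass:.2f} kJ/(kg K)")
print(f"c_p = {cp_mass:.2f} kJ/(kg K), gamma = {gamma:.3f}")
```

The same function gives 6.5R for a linear triatomic molecule such as CO2, in agreement with $R(3q - 2.5)$.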

3.4 STATISTICAL ENSEMBLES AND FLUCTUATIONS

We have finished the discussion about statistical thermodynamics of independent particles without mentioning ensembles. In a system of independent particles, there is no energy associated with particle-particle interactions or the configuration of the particles. For dependent particles or dense fluids, the previous analysis can be extended by using statistical ensembles, an approach pioneered by J. Willard Gibbs (1839–1903) in the late nineteenth century and presented in his 1902 book, Elementary Principles of Statistical Mechanics. A statistical ensemble is a large set of macroscopically similar systems. When the properties are averaged over a properly chosen ensemble, the macroscopic properties can be considered as the same as the time-averaged quantity of the same system. There are three basic types of ensembles: microcanonical ensemble, canonical ensemble, and grand canonical ensemble.1,5

A microcanonical ensemble is composed of a large set of identical systems. Each system in the ensemble is isolated from others by rigid, adiabatic, and impermeable walls. The energy, volume, and number of particles in each system are constant. The results obtained using the microcanonical ensemble for independent particles are essentially the same as what we have obtained in previous sections. It is natural to ask to what extent the statistical mechanics theory presented in previous sections will be valid for nanosystems. If the equilibrium properties are defined based on a large set of microcanonical ensembles and considered as the time-averaged properties of the system, there will be a sufficiently large number of particles in the whole ensemble to guarantee the basic types of statistics, and the thermodynamic relations derived in Secs. 3.1 and 3.2 are still applicable. On the other hand, the difference between the energy levels due to quantization may be large enough to invalidate the substitution of summation with integration. We will discuss the energy level quantization further in Sec. 3.5. In deriving the properties of ideal gases in Sec. 3.3, the consideration of the translational, rotational, and vibrational degrees of freedom is on the basis of individual molecules. Therefore, the conclusions should be applicable to systems under thermodynamic equilibrium.

In a canonical ensemble, each system is separated from others by rigid and impermeable walls, which are diathermal. All systems have the same volume and number of particles. However, the systems can exchange energy. At equilibrium, the temperature T will be the same for all systems. An important result of applying the canonical ensemble is that the energy fluctuation (i.e., the standard deviation of energy of the system) is proportional to $1/\sqrt{N}$, where N is the total number of independent particles.

In a grand canonical ensemble, each system is separated from others by rigid, diathermal, and permeable walls. While the volume is fixed and is the same for each system, the number of particles as well as the energy of each system can vary. The temperature and the chemical potential must be the same for all systems at equilibrium. This allows the study of density fluctuations for each system. The result for monatomic molecules is that the density fluctuation is also proportional to $1/\sqrt{N}$.

The canonical and grand canonical ensembles are essential for the study of complex thermodynamic systems, such as mixtures, chemical equilibria, dense gases, and liquids, which will not be further discussed in this text. Interested readers can find more details in Tien and Lienhard1 and Carey.5 A simple theory based on independent particles of phonons and electrons will be discussed in Chap. 5. While the partition function can also be used to study the thermodynamic relations of solids, the approach used in solid state physics will be adopted in a detailed study of the properties of solids presented in Chap. 7.

3.5 BASIC QUANTUM MECHANICS

So far we have largely avoided the derivations and equations involving quantum mechanics, by using the conclusions from quantum theory on an as-needed basis without proof. In this section, we shall present the basics of quantum mechanics to enhance the understanding of the materials already presented and to provide some background for future chapters. In classical mechanics, the state of a system is completely described by giving the position and the momentum of each particle in the system at any given time. The equation of motion is given in Eq. (3.1), which is also the basis for molecular dynamics. The position and the momentum of each particle are precisely determined using the initial values and the forces exerted on it afterward. According to the wave-particle duality, particles also have wave characteristics. The results are described in quantum mechanics by the Schrödinger wave equation. The solution of the Schrödinger equation is given in the form of a wavefunction, which describes the probabilities of the possible outcomes rather than the exact position and momentum of the particle. Another important aspect in quantum mechanics is the use of operators in mathematical manipulations.

3.5.1 The Schrödinger Equation

Consider the following equation that describes a wave in the x direction (see Appendices B.6 and B.7):

$$\Psi(x,t) = \tilde{A}\, e^{i(2\pi x/\lambda - 2\pi\nu t)} \tag{3.67}$$

where $\tilde{A} = A' + iA''$ is a complex constant, $\lambda$ is the wavelength, and $\nu$ is the frequency. One can take the real part of $\Psi$, i.e., $\text{Re}(\Psi) = A'\cos(2\pi x/\lambda - 2\pi\nu t) - A''\sin(2\pi x/\lambda - 2\pi\nu t)$, which is a cosine function of x for any given t. The complex notation is convenient for obtaining derivatives. If Eq. (3.67) is used to describe a moving particle, with a mass m and a momentum p, it can be shown that

$$-i\hbar \frac{\partial \Psi}{\partial x} = \frac{h}{\lambda}\Psi = p\Psi \tag{3.68a}$$

$$-\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} = \frac{p^2}{2m}\Psi = E_K \Psi \tag{3.68b}$$

and

$$i\hbar \frac{\partial \Psi}{\partial t} = h\nu \Psi = \varepsilon \Psi \tag{3.68c}$$

where $\hbar = h/2\pi$, which is the Planck constant divided by $2\pi$, $E_K$ is the kinetic energy of the particle, and $\varepsilon$ is the total energy of the particle. In writing Eq. (3.68), we have applied the concept of wave-particle duality to relate $p = h/\lambda$ and $\varepsilon = h\nu$. If the particle possesses only the kinetic and potential energies, we have

$$\varepsilon = E_K + E_P = \frac{p^2}{2m} + \Phi(\mathbf{r}) \tag{3.69a}$$

where $\Phi(\mathbf{r}) = \Phi(x,y,z)$ is the potential function that depends on the position of the particle. Define the Hamiltonian operator in the 3-D case as

$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + \Phi(\mathbf{r}) \tag{3.69b}$$

It can be seen that $\hat{H}\Psi = \varepsilon\Psi$. Hence,

$$-\frac{\hbar^2}{2m}\nabla^2\Psi + \Phi(\mathbf{r})\Psi = i\hbar\frac{\partial \Psi}{\partial t} \tag{3.70}$$

which is the time-dependent Schrödinger equation.8 From $\varepsilon\Psi = i\hbar\,\partial\Psi/\partial t$, one can obtain

$$\Psi(\mathbf{r},t) = \Psi_0(\mathbf{r})\, e^{-i\varepsilon t/\hbar} \tag{3.71a}$$

The general time dependence for different energy eigenvalues can be written as a summation:

$$\Psi(\mathbf{r},t) = A_1 \Psi_{01}(\mathbf{r})\, e^{-i\varepsilon_1 t/\hbar} + A_2 \Psi_{02}(\mathbf{r})\, e^{-i\varepsilon_2 t/\hbar} + \cdots \tag{3.71b}$$

Therefore, the key to solving the Schrödinger equation becomes how to obtain the initial wavefunctions. For this reason, Eq. (3.70) can be rewritten as follows:

$$-\frac{\hbar^2}{2m}\nabla^2\Psi + \Phi(\mathbf{r})\Psi = \varepsilon\Psi \tag{3.72}$$

which is called the time-independent Schrödinger equation. The solution gives the wavefunction $\Psi(\mathbf{r})$, which is often expressed in terms of a set of eigenfunctions, $\Psi_1, \Psi_2, \Psi_3, \ldots$, each with an eigenvalue energy, $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$, respectively. The solution, or the wavefunction, must satisfy

$$\int_V \Psi\Psi^*\, dV = 1 \tag{3.73}$$

where the superscript * denotes the complex conjugate since the wavefunction is in general complex, and the integration is over the whole volume. The physical significance is that the probability of finding the particle in the volume must be 1. The wavefunction is also called a state function because it describes the quantum state of the particle, and $\Psi\Psi^*$ is called the probability density function. The average or expectation value of any physical quantity $\eta$ is calculated by

$$\langle\eta\rangle = \int_V \Psi^* \hat{\eta}\, \Psi\, dV \tag{3.74}$$

where $\hat{\eta}$ signifies an operator of $\eta$. For example, the average energy of the particle is

$$\langle\varepsilon\rangle = \int_V \Psi^* \hat{H} \Psi\, dV \tag{3.75}$$

Several examples are discussed in the following sections to show how to obtain the wavefunctions and the physical significance of the solutions.

3.5.2 A Particle in a Potential Well or a Box

The 1-D potential well is illustrated in Fig. 3.13a, where a particle is confined within a physical space between $0 < x < L$ and the particle can move parallel to the x axis only. This is equivalent to saying that the potential energy is zero inside and infinite outside the potential well, i.e.,

$$\Phi(x) = \begin{cases} 0, & \text{for } 0 < x < L \\ \infty, & \text{at } x = 0 \text{ and } L \end{cases} \tag{3.76}$$

FIGURE 3.13 Illustration of (a) a 1-D potential well and (b) the eigenfunctions.

The Schrödinger equation becomes

$$-\frac{\hbar^2}{2m}\nabla^2\Psi = \varepsilon\Psi \tag{3.77}$$

whose solutions are $\Psi(x) = A\cos(kx) + B\sin(kx)$, where $k = \sqrt{2m\varepsilon/\hbar^2}$. Because the particle is confined inside the well, the wavefunction must be zero outside the potential well. Another requirement for the wavefunction is that it must be continuous. Thus, we must have $\Psi(0) = \Psi(L) = 0$. This requires that $A = 0$ and, by taking only the positive k values, we have

$$kL = n\pi, \quad n = 1, 2, 3, \ldots \tag{3.78}$$

The eigenfunctions are therefore $\Psi_n(x) = B_n \sin(n\pi x/L)$, which can be normalized by letting $\int_0^L \Psi_n(x)\Psi_n^*(x)\, dx = 1$ to get

$$\Psi_n(x) = \sqrt{\frac{2}{L}}\, \sin\left(\frac{n\pi x}{L}\right) \tag{3.79}$$


Therefore, the solution requires the particle to possess discretized energy values, i.e., its energy cannot be increased continuously but with finite differences between neighboring states. It can easily be seen that

$$\varepsilon_n = \frac{h^2n^2}{8mL^2} \tag{3.80}$$
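To get a feel for the magnitudes in Eq. (3.80), the following minimal Python sketch evaluates the first few energy levels of an electron in a well. The well width of 10 nm, the function name `box_energy`, and the use of the free-electron rest mass (rather than an effective mass in a semiconductor) are assumptions made here for illustration only.

```python
H = 6.62607015e-34    # Planck constant, J*s
M_E = 9.1093837e-31   # electron rest mass, kg (assumed; not an effective mass)
EV = 1.602176634e-19  # joules per electronvolt


def box_energy(n, mass, length):
    """Energy of quantum state n (n = 1, 2, ...) from Eq. (3.80), in joules."""
    if n < 1:
        raise ValueError("quantum number n must be >= 1")
    return (H * n) ** 2 / (8.0 * mass * length ** 2)


WELL_WIDTH = 10e-9  # assumed well width, m
for n in range(1, 5):
    print(n, box_energy(n, M_E, WELL_WIDTH) / EV, "eV")
```

Under these assumptions the ground level is a few meV and the levels scale as n², so the spacing between neighboring states grows with n and shrinks rapidly as L increases.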

The quantized energy eigenvalues are called energy levels for each quantum state, and the index n is called a quantum number. The eigenfunctions are standing waves as shown in Fig. 3.13b for the first four quantum states. For molecules, the difference between energy levels is very small and the energy distribution can often be approximated as a continuous distribution. For electrons at very small distances, $L \to 10$ nm for example, quantization may be important. The effects of quantum confinement take place when the quantum well thickness becomes comparable to the de Broglie wavelength of the particle, such as electrons or holes in a semiconductor. Quantum wells can be formed by a sandwiched structure of heterogeneous layers, such as AlGaAs/GaAs/AlGaAs. The bandgap of the outer layers is larger than that of the inner layers to form an effective potential well. These structures are used for optoelectronic applications such as lasers and radiation detectors. The thickness of the active region can be a few nanometers. In some cases, multiple quantum wells are formed with periodic layered structures, called superlattices, which have unique optical, electrical, and thermal properties.

Example 3-7. Derive the uncertainty principle. Suppose the wavefunction is given by Eq. (3.79) for a particle with energy $\varepsilon_n$ given in Eq. (3.80).

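Solution. To find the average position of the particle, we use

$$\langle x\rangle = \int_0^L \Psi^*x\Psi\,dx = \frac{2}{L}\int_0^L x\sin^2\left(\frac{n\pi x}{L}\right)dx = \frac{L}{2}$$

The variance of x is $\sigma_x^2 = \langle(x - \langle x\rangle)^2\rangle = \langle x^2\rangle - 2\langle x\rangle^2 + \langle x\rangle^2 = \langle x^2\rangle - \langle x\rangle^2$. With

$$\langle x^2\rangle = \frac{2}{L}\int_0^L x^2\sin^2\left(\frac{n\pi x}{L}\right)dx = \frac{L^2}{3} - \frac{L^2}{2n^2\pi^2}$$

we obtain the standard deviation of x as

$$\sigma_x = L\left(\frac{1}{12} - \frac{1}{2n^2\pi^2}\right)^{1/2}$$

For the momentum, we use the operator $p \to -i\hbar(\partial/\partial x)$. Hence,

$$\langle p\rangle = \int_0^L \Psi^*\left(-i\hbar\frac{d}{dx}\right)\Psi\,dx = -i\hbar\frac{2n\pi}{L^2}\int_0^L \sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi x}{L}\right)dx = 0$$

and

$$\langle p^2\rangle = \int_0^L \Psi^*(-\hbar^2)\frac{d^2\Psi}{dx^2}\,dx = \left(\frac{n\pi\hbar}{L}\right)^2$$

We have $\sigma_p = n\pi\hbar/L$ and obtain the following expression:

$$\sigma_x\sigma_p = \frac{\hbar}{2}\left(\frac{\pi^2n^2}{3} - 2\right)^{1/2} \tag{3.81}$$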


Taking the smallest quantum number, $n = 1$, we get $\sigma_x\sigma_p \approx 0.5678\hbar > \hbar/2$, which verifies the uncertainty principle given in Eq. (3.9) for this wavefunction.
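The result of Example 3-7 can be checked numerically. The sketch below, written for this purpose with hypothetical function names, evaluates Eq. (3.81) in units of $\hbar$ and cross-checks the standard deviation of x by a simple midpoint-rule integration of the probability density from Eq. (3.79) with L set to 1.

```python
import math


def sigma_x_over_L(n):
    """Closed-form standard deviation of position divided by L for state n."""
    return math.sqrt(1.0 / 12.0 - 1.0 / (2.0 * (n * math.pi) ** 2))


def uncertainty_product(n):
    """sigma_x * sigma_p in units of hbar, from Eq. (3.81)."""
    return 0.5 * math.sqrt((math.pi * n) ** 2 / 3.0 - 2.0)


def sigma_x_numeric(n, steps=20000):
    """Midpoint-rule estimate of sigma_x / L using |Psi_n|^2 with L = 1."""
    dx = 1.0 / steps
    mean = 0.0
    mean_sq = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        prob = 2.0 * math.sin(n * math.pi * x) ** 2 * dx
        mean += x * prob
        mean_sq += x * x * prob
    return math.sqrt(mean_sq - mean ** 2)


for n in range(1, 4):
    print(n, uncertainty_product(n), sigma_x_numeric(n))
```

The product grows with n, so the ground state gives the value closest to the lower bound $\hbar/2$.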

Next, consider a free particle in a 3-D box, $0 < x < a$, $0 < y < b$, $0 < z < c$. It can be shown that the (normalized) eigenfunctions are

$$\Psi_{x,y,z} = \sqrt{\frac{8}{abc}}\sin\left(\frac{n_x\pi x}{a}\right)\sin\left(\frac{n_y\pi y}{b}\right)\sin\left(\frac{n_z\pi z}{c}\right) \tag{3.82}$$

with the energy eigenvalues:

$$\varepsilon_{x,y,z} = \frac{h^2}{8m}\left(\frac{n_x^2}{a^2} + \frac{n_y^2}{b^2} + \frac{n_z^2}{c^2}\right) \tag{3.83}$$

where $n_x, n_y, n_z = 1, 2, 3, \ldots$ When $a = b = c = V^{1/3}$, Eq. (3.83) can be simplified as

$$\varepsilon_{x,y,z} = \frac{h^2}{8mV^{2/3}}\left(n_x^2 + n_y^2 + n_z^2\right) \tag{3.84}$$

Let $\eta = (n_x^2 + n_y^2 + n_z^2)^{1/2}$; then we can evaluate the number of quantum states between $\eta$ and $\eta + d\eta$, which is nothing but the degeneracy. For sufficiently large V, the quantum states are so close to each other that the volume within the spherical shell between $\eta$ and $\eta + d\eta$ is equal to the number of quantum states. Only one octant of the sphere is considered in Eq. (3.84) because $n_x > 0$, $n_y > 0$, $n_z > 0$. The total volume is therefore one-eighth of the spherical shell; hence,

$$dg = \frac{1}{8}4\pi\eta^2\,d\eta = \frac{2\pi V(2m)^{3/2}}{h^3}\varepsilon^{1/2}\,d\varepsilon \tag{3.85}$$

With $\varepsilon = \frac{1}{2}mv^2$ and $d\varepsilon = mv\,dv$, we obtain

$$dg = \frac{m^3V}{h^3}4\pi v^2\,dv \tag{3.86}$$

This equation is essentially the same as Eq. (3.32), with $dx\,dy\,dz = V$ and $dv_x\,dv_y\,dv_z = 4\pi v^2\,dv$. Equation (3.86) provides a rigorous proof of Eq. (3.32), which is the translational degeneracy. It should be noted that classical statistical mechanics results in the same expressions for U and p, as well as the Maxwell velocity distribution for ideal gases. However, the constant h must be included to correctly express S as in Eq. (3.41). Equation (3.86) will also be used in Chap. 5 to study the free electron gas in metals. When using the momentum $p = mv$ as the variable, we have

$$dg = \frac{V}{h^3}4\pi p^2\,dp \tag{3.87}$$

Because Eq. (3.87) does not involve mass, it is also applicable to phonons and photons as will be discussed in Chaps. 5 and 8.
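The counting argument behind Eq. (3.85) replaces a sum over integer triples by the volume of one octant of a sphere. A minimal sketch of that comparison is given below; the function names are hypothetical, and the brute-force count is only meant to illustrate how the approximation improves as the radius $\eta$ grows.

```python
import math


def count_states(eta_max):
    """Count positive-integer triples with nx^2 + ny^2 + nz^2 <= eta_max^2."""
    r2 = eta_max * eta_max
    count = 0
    for nx in range(1, eta_max + 1):
        for ny in range(1, eta_max + 1):
            rem = r2 - nx * nx - ny * ny
            if rem < 1:
                break  # no nz >= 1 fits; a larger ny only makes rem smaller
            count += math.isqrt(rem)  # allowed nz are 1 .. floor(sqrt(rem))
    return count


def octant_volume(eta_max):
    """One-eighth of the sphere volume, i.e., (1/8) 4 pi eta^2 integrated to eta_max."""
    return math.pi * eta_max ** 3 / 6.0


for eta in (10, 50, 100):
    exact = count_states(eta)
    print(eta, exact, exact / octant_volume(eta))
```

The ratio approaches 1 from below as $\eta$ increases; the remaining deficit comes from states near the coordinate planes, which the continuum volume overcounts.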

3.5.3 A Rigid Rotor

The rigid rotor model can be used to study the rotational movement of diatomic molecules as well as the movement of an electron in a hydrogen atom. Consider two particles separated by a fixed distance $r_0 = r_1 + r_2$ as shown in Fig. 3.14. The masses of the particles are $m_1$ and $m_2$, respectively. Since the center of mass is at the origin, we have $m_1r_1 = m_2r_2$. The moment of inertia is

$$I = m_1r_1^2 + m_2r_2^2 = m_rr_0^2 \tag{3.88}$$

FIGURE 3.14 Schematic of a rotor consisting of two particles.

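where $m_r = m_1m_2/(m_1 + m_2)$ is the reduced mass. We can study the rotational movement of the particles by considering a particle with a mass of $m_r$ that rotates around at a fixed distance $r = r_0$ from the origin in the $\theta$ and $\phi$ directions. In the spherical coordinates,

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2} \tag{3.89}$$

Because $r \equiv r_0$, the derivative with respect to r vanishes. The potential energy is zero for free rotation. By setting the mass to be $m_r$ and $\Phi = 0$ in Eq. (3.72) and noticing that $m_rr_0^2 = I$, we obtain

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\Psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\Psi}{\partial\phi^2} = -\frac{2I\varepsilon}{\hbar^2}\Psi \tag{3.90}$$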

This partial differential equation can be solved by separation of variables. We get two ordinary differential equations by letting $\Psi(\theta,\phi) = P(\theta)\psi(\phi)$, i.e.,

$$\frac{d^2\psi}{d\phi^2} = -m^2\psi \tag{3.91}$$

and

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{dP}{d\theta}\right) + \left(\frac{2I\varepsilon}{\hbar^2} - \frac{m^2}{\sin^2\theta}\right)P = 0 \tag{3.92}$$

Here, m is a new eigenvalue, and the periodic boundary conditions shall be applied to P and $\psi$, respectively. The solution of Eq. (3.91) is readily obtained as

$$\psi(\phi) = Ae^{im\phi} \tag{3.93}$$


with $m = 0, \pm 1, \pm 2, \ldots$, to satisfy the periodic boundary conditions: $\psi(\phi) = \psi(2\pi + \phi)$. A transformation, $\cos\theta = \xi$, can be used so that Eq. (3.92) becomes

$$(1-\xi^2)\frac{d^2P}{d\xi^2} - 2\xi\frac{dP}{d\xi} + \left(\frac{2I\varepsilon}{\hbar^2} - \frac{m^2}{1-\xi^2}\right)P = 0 \tag{3.94}$$

Because $\theta$ is defined from 0 to $\pi$, we have $-1 \le \xi \le 1$. In order for Eq. (3.94) to have solutions that are bounded at $\xi = \pm 1$, we need $2I\varepsilon/\hbar^2 = l(l+1)$, where l is an integer that is greater than or equal to the absolute value of m. Therefore, the energy eigenvalues are

$$\varepsilon_l = \frac{\hbar^2}{2I}l(l+1), \quad l = |m|, |m|+1, |m|+2, \text{ etc.} \tag{3.95}$$

Equation (3.94) is called the associated Legendre differential equation. The solutions are the associated Legendre polynomials given as

$$P_l^m(\xi) = \frac{(1-\xi^2)^{m/2}}{l!\,2^l}\frac{d^{m+l}}{d\xi^{m+l}}(\xi^2-1)^l \tag{3.96}$$

Finally, after normalization, the standing wavefunctions can be expressed as

$$\Psi_l^m(\theta,\phi) = \frac{1}{\sqrt{2\pi}}\left[\frac{(2l+1)(l-m)!}{2(l+m)!}\right]^{1/2}P_l^m(\cos\theta)\,e^{im\phi} \tag{3.97}$$

Discussion. It can be seen that Eq. (3.95) is identical to Eq. (3.52). The energy level is determined by the principal quantum number l. On the other hand, for each l, there are $2l + 1$ quantum states corresponding to each individual m, because m can take $0, \pm 1, \pm 2$ up to $\pm l$. This means that the degeneracy $g_l = 2l + 1$. When the two atoms are identical, such as in a nitrogen molecule, the atoms are indistinguishable when they switch positions. The degeneracy is reduced by a symmetry number, as given in the expression of Eq. (3.53). It should be noted that the nuclear spin degeneracy is important for hydrogen (see Problem 3.27) [1].
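The degeneracy count in the discussion above can be reproduced by simply enumerating the allowed (l, m) pairs. The following sketch uses hypothetical helper names and expresses the energy of Eq. (3.95) in units of $\hbar^2/2I$, so no molecular data need to be assumed.

```python
def rotor_states(l_max):
    """All (l, m) pairs with 0 <= l <= l_max and m = -l, ..., l."""
    return [(l, m) for l in range(l_max + 1) for m in range(-l, l + 1)]


def rotor_energy(l):
    """Energy of level l in units of hbar^2 / (2 I), from Eq. (3.95)."""
    return l * (l + 1)


def degeneracy(l):
    """Number of enumerated states that share the quantum number l."""
    return sum(1 for (level, _m) in rotor_states(l) if level == l)


for l in range(4):
    print(l, rotor_energy(l), degeneracy(l))
```

The enumeration gives 1, 3, 5, 7 states for l = 0 to 3, and the spacing between adjacent levels, $2(l+1)$ in these units, grows linearly with l.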

3.5.4 Atomic Emission and the Bohr Radius

A hydrogen atom is composed of a proton and an electron. Since the mass of the proton is much greater than that of the electron, it can be modeled as the electron moving around the nucleus. The mass of the electron is $m_e = 9.11\times10^{-31}$ kg, and the position of the electron can be described in the spherical coordinates as $\mathbf{r} = (r,\theta,\phi)$. The force exerted on the electron is Coulomb's force, which gives a potential field

$$\Phi(r) = -\frac{C_1}{r} \tag{3.98}$$

where $C_1 = e^2/(4\pi\varepsilon_0) = 2.307\times10^{-28}$ N·m², with the electron charge $e = 1.602\times10^{-19}$ C and the permittivity of free space $\varepsilon_0 = 8.854\times10^{-12}$ F/m. Let $\Psi(r,\theta,\phi) = R(r)P(\theta)\psi(\phi)$. In doing the separation of variables, we notice that the potential is independent of $\theta$ and $\phi$, and the total energy is equal to the sum of the rotational energy and the energy associated with r. The eigenvalues for the rotational energy are given in Eq. (3.95). Using Eq. (3.72) and Eq. (3.89), we can write the equation for R(r) as follows:

$$\frac{\hbar^2}{2m_er^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\frac{C_1}{r} + \varepsilon - \frac{l(l+1)\hbar^2}{2I}\right)R = 0 \tag{3.99}$$


which is the associated Laguerre equation, and its solutions are the associated Laguerre polynomials. The solutions give the energy eigenvalues as [5,8]

$$\varepsilon_n = -\frac{m_eC_1^2}{2\hbar^2n^2} \tag{3.100}$$

where the negative values are used for convenience to show that the energy increases with the principal quantum number n. For $n = 1$, $m_eC_1^2/2\hbar^2 = 13.6$ eV, as shown in Fig. 3.15.

FIGURE 3.15 Electron orbits (left) and energy levels (right) in a hydrogen atom. The ionization energy is the energy required for an electron to escape the orbit.

Note that 1 eV $= 1.602\times10^{-19}$ J. When the electron is in a higher energy state, it tends to relax to a lower energy state by spontaneously emitting a photon, with precisely the same energy as given by the energy difference between the two energy levels:

$$h\nu = \varepsilon_i - \varepsilon_j = \frac{m_eC_1^2}{2\hbar^2}\left(\frac{1}{n_j^2} - \frac{1}{n_i^2}\right) \tag{3.101}$$

The emission or absorption of photons by electrons is called an electronic transition. When $i = 3$ and $j = 1$, we have $h\nu = 12.1$ eV, corresponding to the wavelength of 102.6 nm (ultraviolet), which is the second line in the Lyman series. When $i = 3$ and $j = 2$, we have $h\nu = 1.89$ eV, corresponding to the wavelength of 656.4 nm (red), which is the first line in the Balmer series. A more detailed description of the atomic emission lines can be found from Sonntag and van Wylen [2].

The next question is: What is the radius of a particular electron orbit? This is an important question because it gives us a sense of how small an atom is. When a particle is in an orbit, the classical force balance gives that

$$m_e\frac{v^2}{r} = \frac{C_1}{r^2} \tag{3.102}$$

which is to say that $E_K = m_ev^2/2 = C_1/2r$, and the sum of the kinetic and potential energies is

$$\varepsilon = E_K + E_P = \frac{C_1}{2r} - \frac{C_1}{r} = -\frac{C_1}{2r} \tag{3.103}$$

Equations (3.100) and (3.103) can be combined to give discrete values of the radius of each orbit in the following:

$$r_n = \frac{\hbar^2}{m_eC_1}n^2 = a_0n^2 \tag{3.104}$$
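The numbers quoted in this section follow directly from Eqs. (3.100), (3.101), and (3.104). The sketch below recomputes them from physical constants; the constant values are standard CODATA figures supplied here, the function names are hypothetical, and the proton is treated as infinitely heavy, so the wavelengths are expected to agree with the quoted ones only to within a fraction of a nanometer.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # permittivity of free space, F/m
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
H = 6.62607015e-34          # Planck constant, J*s
C = 2.99792458e8            # speed of light, m/s
M_E = 9.1093837e-31         # electron rest mass, kg

C1 = E_CHARGE ** 2 / (4.0 * math.pi * EPS0)  # Coulomb constant of Eq. (3.98), N*m^2


def level_energy_ev(n):
    """Energy eigenvalue of Eq. (3.100) in eV (negative for bound states)."""
    return -M_E * C1 ** 2 / (2.0 * HBAR ** 2 * n ** 2) / E_CHARGE


def transition(i, j):
    """Photon energy (eV) and wavelength (nm) for a transition i -> j, Eq. (3.101)."""
    energy_ev = level_energy_ev(i) - level_energy_ev(j)
    wavelength_nm = H * C / (energy_ev * E_CHARGE) * 1e9
    return energy_ev, wavelength_nm


def orbit_radius_nm(n):
    """Orbit radius of Eq. (3.104) in nm; n = 1 gives the Bohr radius."""
    return HBAR ** 2 * n ** 2 / (M_E * C1) * 1e9


print(level_energy_ev(1), transition(3, 1), transition(3, 2), orbit_radius_nm(1))
```

Since the radius scales as n², the n = 3 orbit in this simple picture is nine times larger than the ground-state orbit.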


When the electron is in the innermost orbit, the radius is given by $a_0 = \varepsilon_0h^2/(\pi m_ee^2) = 0.0529$ nm, which is called the Bohr radius. Therefore, the hydrogen atom in its ground state can be considered as having a diameter of approximately 1 Å (Angstrom), or 0.1 nm. Niels Bohr (1885–1962) was a Danish physicist who received the Nobel Prize in Physics in 1922 for his contributions to the understanding of the structure of atoms and quantum physics. One should accept the quantum interpretation of the electron radius as a characteristic length, not the exact distance that the electron would rotate around the nucleus in the same manner a planet rotates around a star.

3.5.5 A Harmonic Oscillator

The last example of quantum mechanics is the linear spring as shown in Fig. 3.16. Consider a 1-D oscillator with a mass m and the spring force $F(x) = -Kx$. The origin can be selected such that $F(0) = 0$. It can be shown that the potential is

$$\Phi(x) = -\int_0^x F(x)\,dx = \frac{1}{2}Kx^2 \tag{3.105}$$

FIGURE 3.16 A linear spring.

From classical mechanics, we can solve Newton's equation $m\ddot{x} + Kx = 0$ to obtain the solution

$$x = A\sin(\omega t + \phi_0) \tag{3.106}$$

where constant A is the amplitude, constant $\phi_0$ is the initial phase, and parameter $\omega = \sqrt{K/m}$ is the angular resonance frequency. It can be shown that the total energy $\varepsilon = E_K + E_P = KA^2/2$ is a constant and the maximum displacement is A. The velocity is the largest at $x = 0$ and zero at $x = \pm A$. The Schrödinger wave equation can be written as

$$\frac{\hbar^2}{2m}\frac{d^2\Psi}{dx^2} + \left(\varepsilon - \frac{Kx^2}{2}\right)\Psi = 0 \tag{3.107}$$

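with the boundary condition being $\Psi(x) = 0$ at $x \to \pm\infty$. The constants can be grouped by using $\alpha = 2m\varepsilon/\hbar^2$ and $\beta = \sqrt{Km}/\hbar$. Then Eq. (3.107) can be transformed by using $\xi = \sqrt{\beta}\,x$ and $\Psi(x) = Q(\xi)\exp(-\xi^2/2)$ to

$$\frac{d^2Q}{d\xi^2} - 2\xi\frac{dQ}{d\xi} + \left(\frac{\alpha}{\beta} - 1\right)Q = 0 \tag{3.108}$$

This is the Hermite equation, and the solutions are Hermite polynomials given by

$$H_n(\xi) = (-1)^ne^{\xi^2}\frac{d^n}{d\xi^n}\left(e^{-\xi^2}\right) \tag{3.109}$$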


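where $\alpha$ and $\beta$ must satisfy the eigenvalue equation:

$$\frac{\alpha}{\beta} - 1 = 2n, \quad n = 0, 1, 2, \ldots \tag{3.110}$$

The normalized wavefunctions can be written as

$$\Psi_n(x) = \left(\frac{\sqrt{\beta/\pi}}{n!\,2^n}\right)^{1/2}H_n(\beta^{1/2}x)\exp\left(-\frac{\beta x^2}{2}\right) \tag{3.111}$$

The energy eigenvalues can be obtained from Eq. (3.110) as

$$\varepsilon_n = \left(n + \frac{1}{2}\right)\hbar\sqrt{K/m} = \left(n + \frac{1}{2}\right)\hbar\omega \tag{3.112}$$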

The above equation was used to study the vibrational contributions in diatomic molecules; see Eq. (3.59). The 1/2 term was not included in Planck's original derivation of the blackbody radiation function. The significance lies in that if the ground-state energy is zero, both its kinetic energy and potential energy must be zero, suggesting that both the position and the momentum must be zero. This would violate the uncertainty principle. As mentioned earlier, in classical mechanics, the particle is limited to the region $-A < x < A$, where A is the amplitude given in Eq. (3.106). This is not the case in the quantum theory, as shown in Fig. 3.17, for the first few energy levels and the associated wavefunctions. Notice that the probability density function $\Psi^2$ is nonzero even where the absolute value of x exceeds $\sqrt{2\varepsilon/K}$.

FIGURE 3.17 Wavefunctions and probability density functions for vibration energy levels.
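The leakage of the probability density beyond the classical turning points can be quantified. The sketch below is an illustration with hypothetical function names: it builds $H_n$ from the standard recurrence $H_{k+1} = 2\xi H_k - 2kH_{k-1}$ (equivalent to Eq. (3.109)), writes the probability density of Eq. (3.111) in terms of $\xi = \sqrt{\beta}\,x$, and integrates it numerically outside $|\xi| = \sqrt{2n+1}$, which is where $Kx^2/2$ equals $\varepsilon_n$ of Eq. (3.112).

```python
import math


def hermite(n, xi):
    """Hermite polynomial H_n(xi) via H_{k+1} = 2 xi H_k - 2 k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h


def prob_density(n, xi):
    """Probability density per unit xi for state n, from Eq. (3.111)."""
    norm = 1.0 / (math.sqrt(math.pi) * math.factorial(n) * 2 ** n)
    return norm * hermite(n, xi) ** 2 * math.exp(-xi * xi)


def prob_outside(n, xi_max=12.0, steps=20000):
    """Probability of finding the particle beyond the classical turning points."""
    xi_turn = math.sqrt(2 * n + 1)
    dxi = (xi_max - xi_turn) / steps
    total = sum(prob_density(n, xi_turn + (i + 0.5) * dxi) for i in range(steps))
    return 2.0 * total * dxi  # factor 2 accounts for both sides by symmetry


for n in range(3):
    print(n, prob_outside(n))
```

For the ground state the integral reduces to the complementary error function erfc(1), roughly 16%, and the fraction decreases for higher levels.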

The application of quantum theory allows us to predict the specific heat of ideal gases. In deriving the equations shown in Sec. 3.3.3, we have largely neglected nonlinear and anharmonic vibration, electronic contribution, and dissociation. These factors may become important at very high temperatures. The degeneracy due to the coupling of rotation and vibration can cause multiple absorption/emission lines in the infrared in polyatomic molecular gases, as shown in Fig. 3.12.


3.6 EMISSION AND ABSORPTION OF PHOTONS BY MOLECULES OR ATOMS

We have learned that the emission of photons is associated with transitions from a higher energy level to a lower energy level that reduces the total energy of the molecular system. The reverse process is the absorption of photons that increases the energy of the system through transitions from lower energy levels to higher energy levels. As discussed earlier, an electronic transition requires a large amount of energy, and the emitted or absorbed photons are at frequencies from deep ultraviolet ($\lambda \approx 100$ nm) to slightly beyond the red end of the visible region ($\lambda \approx 1$ μm). On the other hand, vibration or rotation-vibration modes lie in the mid-infrared (2.5 μm $< \lambda <$ 25 μm), while their overtones or higher-order harmonics lie in the near-infrared region (0.8 μm $< \lambda <$ 2.5 μm). Rotational modes alone may be active in the far-infrared and microwave regions ($\lambda >$ 25 μm). Transitions between different energy levels of the molecules or atoms are called bound-bound transitions, because these energy states are called bound states. Bound-bound transitions happen at discrete frequencies due to quantization of energy levels. Dissociation or ionization can also occur at high temperatures. In these cases, the difference between adjacent energy levels is very small because the electrons can move freely (i.e., not bound to the atom or the molecule). Therefore, free-free or bound-free transitions happen over a broad band of frequencies. In gases, these broader transitions occur only at extremely high temperatures. If a molecule at elevated energy states were placed in a surrounding at zero absolute temperature (i.e., empty space), it would lower its energy states by emitting photons in all directions until reaching its ground state. However, the emission processes should occur spontaneously regardless of the surroundings.
Suppose the molecule is placed inside an isothermal enclosure. After a long time, the energy absorbed must be equal to that emitted to establish a thermal equilibrium with its surroundings. The thermal fluctuation of oscillators is responsible for the equilibrium distribution, i.e., Planck's law developed in 1900. Einstein examined how matter and radiation can achieve thermal equilibrium in a fundamental way and published a remarkable paper, "On the quantum theory of radiation," in 1917 [11]. The interaction of radiation with matter is essentially through emission or absorption at the atomistic dimension, although solids or liquids can reflect radiation and small particles can scatter radiation. Einstein noticed that spontaneous emission and pure absorption (i.e., transition from a lower level to a higher level by absorbing the energy from the incoming radiation) alone would not allow an equilibrium state of an atom to be established with the radiation field. He then hypothesized the concept of stimulated or induced emission, which became the underlying principle of lasers. In a stimulated emission process, an incoming photon interacts with the atom. The interaction results in a transition from a higher energy state to a lower energy state by the emission of another photon of the same energy toward the same direction as the incoming photon. In other words, the stimulated photon is a clone of the stimulating photon with the same energy and momentum. Whether an incoming photon will be absorbed, will stimulate another, or will pass by without any effect on the atom is characterized by the probabilities of these events. Understanding the emission and absorption processes is important not only for coherent emission but also for thermal radiation [12]. While more detailed treatments will be given in later chapters, it is important to gain a basic understanding of the quantum theory of radiative transitions and microscopic description of the radiative properties.
Consider a canonical ensemble of single molecules or atoms, with two nondegenerate energy levels, $\varepsilon_1$ and $\varepsilon_2$ ($\varepsilon_2 > \varepsilon_1$), in thermal equilibrium with an enclosure or cavity at temperature T. Suppose the total number of particles is N, and let $N_1$ and $N_2$ be the number of particles at the energy level corresponding to $\varepsilon_1$ and $\varepsilon_2$, respectively. These particles do not interact with each other at all. The concept of canonical ensemble can be understood as if each cavity has only one atom, but there are N single-atom cavities with one atom in each cavity. As shown in Fig. 3.18, there are three possible interaction mechanisms, i.e., spontaneous emission, stimulated emission, and stimulated or induced absorption. Here, stimulated absorption refers to the process that the energy of the photon is absorbed, and consequently, the transition occurs from the lower energy level to the higher energy level. In a stimulated absorption process, the number of photons before the process is 1 and after the process is $1 - 1 = 0$. In a stimulated emission process, the number of photons beforehand is 1 and afterward is $1 + 1 = 2$. Therefore, stimulated emission is regarded also as negative absorption. Each of the photons involved in this process will have an energy equal to $h\nu = \varepsilon_2 - \varepsilon_1$ and a momentum $h\nu/c$. Transition from the higher energy level to the lower energy level cannot take place if the population of atoms on the higher energy level is $N_2 = 0$, and vice versa. Einstein further assumed that the probability of transition is proportional to the population at the initial energy level, and spontaneous transition should be independent of the radiation field. Hence, the rate of transition from $\varepsilon_2$ to $\varepsilon_1$ due to spontaneous emission can be written as

$$\left(\frac{dN_1}{dt}\right)_A = -\left(\frac{dN_2}{dt}\right)_A = AN_2 \tag{3.113}$$

FIGURE 3.18 Illustration of the emission and absorption processes. (a) Spontaneous emission. (b) Stimulated emission. (c) Stimulated absorption.

where A is Einstein's coefficient of spontaneous emission. On the other hand, the transition rate due to stimulated emission should also be proportional to the energy density of the radiation field $u(\nu,T)$. Thus,

$$\left(\frac{dN_1}{dt}\right)_B = BN_2u(\nu,T) \tag{3.114}$$

Stimulated absorption will cause a transition rate that is proportional to $N_1$ and $u(\nu,T)$:

$$\left(\frac{dN_1}{dt}\right)_C = -CN_1u(\nu,T) \tag{3.115}$$

In Eq. (3.114) and Eq. (3.115), constants B and C are Einstein's coefficients of stimulated emission and absorption, respectively. The combination of these processes must maintain a zero net change of the populations at equilibrium. Thus,

$$AN_2 + BN_2u(\nu,T) - CN_1u(\nu,T) = 0 \tag{3.116}$$

Atoms or molecules in a thermal equilibrium are described by the Maxwell-Boltzmann statistics of molecular gases given by Eq. (3.26): $N_1/N_2 = e^{(\varepsilon_2 - \varepsilon_1)/k_BT} = e^{h\nu/k_BT}$. Therefore, Eq. (3.116) can be rewritten as

$$u(\nu,T) = \frac{A/B}{(C/B)e^{h\nu/k_BT} - 1} \tag{3.117}$$


Comparing this equation with Planck's distribution, Eq. (8.41) in Chap. 8, we see that $B = C$ and $A/B = 8\pi h\nu^3/c^3$. The two-level system can easily be generalized to arbitrary energy levels to describe the fundamental emission and absorption processes. The emission and absorption processes not only exchange energy between the field and the atom but also transfer momentum. How will an atom move inside a cavity? The phenomenon of a molecule or atom in a radiation field is like the Brownian motion, in which the radiation quanta exert forces on the molecule or the atom as a result of momentum transfer during each emission or absorption process. Consequently, the molecule or the atom will move randomly following Maxwell's velocity distribution at the same temperature as the radiation field. The equilibrium radiation field, which obeys the quantum statistics (i.e., BE statistics) that was not realized until 1924, and the motion of a molecular gas, which obeys classical statistics, can be coupled to each other to reach mutual equilibrium. Einstein also asserted that each spontaneously emitted photon must be directional, while the probability of spontaneous emission should be the same in all directions. In fact, Einstein's 1917 paper complemented Planck's 1900 paper on radiation energy quanta and his own 1905 paper on photoelectric emission and, thus, provided a complete description of the quantum nature of photons, although the name "photon" was not coined until 1926. At moderate temperatures, the population at higher energy states is too small for stimulated emission to be of significance for optical and thermal radiation. Thus, the absorption comes solely from induced absorption. When stimulated emission is important, the contributions of stimulated emission and stimulated absorption cannot be separated by experiments.
The effect is combined to give an effective absorption coefficient by taking stimulated emission as negative absorption, whereas the emission of radiation includes solely the spontaneous emission [12]. The effective absorption coefficient is proportional to the population difference, $N_1 - N_2$. On the other hand, if a population inversion can be created and maintained such that $N_2 > N_1$, the material is called a gain medium or active medium. In an active medium, stimulated emission dominates stimulated absorption so that more and more photons will be cloned and the radiation field will be amplified coherently. The principle of stimulated emission was applied in the 1950s and early 1960s for the development of the maser, which stands for microwave amplification by stimulated emission of radiation, and the laser, which stands for light amplification by stimulated emission of radiation [13]. Lasers have become indispensable to modern technologies and daily life.
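The statement that stimulated emission is negligible at moderate temperatures can be made quantitative. With $B = C$ and Eq. (3.117), the ratio of the stimulated to the spontaneous emission rate in equilibrium is $Bu/A = 1/(e^{h\nu/k_BT} - 1)$, while the Maxwell-Boltzmann population ratio is $N_2/N_1 = e^{-h\nu/k_BT}$. The sketch below evaluates both; the function names and the sample wavelengths and temperature are assumptions chosen only for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K


def population_ratio(wavelength_m, temperature):
    """Equilibrium N2/N1 for a two-level system separated by h*nu = h*c/lambda."""
    return math.exp(-H * C / (wavelength_m * KB * temperature))


def stimulated_to_spontaneous(wavelength_m, temperature):
    """Equilibrium ratio B*u/A = 1 / (exp(h nu / kB T) - 1), using B = C."""
    return 1.0 / math.expm1(H * C / (wavelength_m * KB * temperature))


for wavelength in (0.5e-6, 10e-6, 1e-3):  # visible, mid-infrared, far-infrared/microwave
    print(wavelength, population_ratio(wavelength, 300.0),
          stimulated_to_spontaneous(wavelength, 300.0))
```

At room temperature the ratio is vanishingly small in the visible and below one percent near 10 μm, but it exceeds unity at millimeter wavelengths, which is consistent with masers having preceded lasers.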

3.7 ENERGY, MASS, AND MOMENTUM IN TERMS OF RELATIVITY

The special theory of relativity, or special relativity, predicts that energy and mass can be converted to each other. If we retain the definition of mass as in the classical theory, only energy conservation is the fundamental law of physics. The mass does not have to be conserved. On the other hand, for processes that do not involve changes below the atomic level or inside the nuclei, the mass can indeed be considered as conserved. According to special relativity, the rest energy of a free particle is related to its mass and the speed of light by

$$E_0 = mc^2 \tag{3.118}$$

The rest energy is simply the energy when the particle is not moving relative to the reference frame. Suppose the free particle is moving at a velocity v in a given reference frame; then its momentum is given by [14]

$$p = \frac{mv}{\sqrt{1 - v^2/c^2}} \tag{3.119}$$


When $v \ll c$, Eq. (3.119) reduces to the classical limit, i.e., $p = mv$. It can be seen that for a particle with nonzero mass, its momentum would increase as $v \to c$ without any bound. There is no way we could accelerate a particle to the speed of light. If there is anything that travels with the speed of light, it has to be massless, i.e., $m = 0$. An example of massless particles is the light quanta or photons. The kinetic energy can be evaluated by integrating the work needed to accelerate a particle, i.e.,

$$E_K = \int_0^x F\,dx = \int_0^x\frac{dp}{dt}\,dx = \int_0^x\frac{dp}{dv}\frac{dv}{dt}\,dx = \int_0^v\frac{dp}{dv}v\,dv$$

Using Eq. (3.119), we find that

$$E_K = \frac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2 \tag{3.120}$$

When $v \ll c$, we have $1/\sqrt{1 - v^2/c^2} \approx 1 + v^2/2c^2$ so that $E_K = mv^2/2 = p^2/2m$ in the low-speed limit. In the relativistic limit, however, $E_K$ will be on the order of $mc^2$. Because energy is additive, the total energy of a moving free particle is

$$E = E_K + E_0 = \frac{mc^2}{\sqrt{1 - v^2/c^2}} \tag{3.121}$$

Obviously, the energy of a particle would become infinite if its speed approaches the speed of light, unless its mass goes to zero. It can be shown that $E^2 - E_0^2 = m^2c^4/(1 - v^2/c^2) - m^2c^4 = p^2c^2$, where p is given in Eq. (3.119). This gives another expression of energy in terms of the rest energy, the momentum, and the speed of light as follows:

$$E^2 = m^2c^4 + p^2c^2 \tag{3.122}$$

It should be noted that, in general, pc is not equal to the kinetic energy. For $v \ll c$, the total energy is approximately the same as the rest energy. Comparing Eq. (3.119) and Eq. (3.121), we notice that $E = pc(c/v)$. Therefore, when $v \to c$, we see that $E \to pc$ (which is unbounded unless $m = 0$). For a photon that travels at the speed of light, in order for the above equations to be meaningful, we must set its mass to zero. From Eq. (3.122), we have for photons that

$$p = \frac{E}{c} = \frac{h\nu}{c} \tag{3.123}$$

which is the same as Eq. (3.7) in Sec. 3.1.3. By noting that $\lambda\nu = c$, we obtain

$$\lambda = \frac{h}{p} \tag{3.124}$$

The kinetic energy of a photon is pc or $h\nu$ since its rest energy is zero. One should not attempt to calculate the kinetic energy of a photon by $\frac{1}{2}mc^2$, because photons are not only massless but also relativistic particles, for which the energy and momentum must be evaluated according to the above-mentioned equations. While photons do not have mass, it has been observed that photons can be used to create particles with nonzero mass or vice versa, as in creation or annihilation reactions. High energy physics has proven that mass is not always conserved. Furthermore, energy and mass can be interconverted. A small amount of mass can be converted into a large amount of energy, as in a nuclear reaction.
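The relations in Eqs. (3.119) through (3.122) are easy to verify numerically. The following sketch, with hypothetical function names and an electron mass chosen purely as an example, checks that $E^2 = m^2c^4 + p^2c^2$ holds at a relativistic speed and that the kinetic energy approaches $mv^2/2$ at low speed.

```python
import math

C = 2.99792458e8  # speed of light, m/s


def gamma(v):
    """Lorentz factor 1 / sqrt(1 - v^2/c^2); valid only for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def momentum(m, v):
    """Relativistic momentum, Eq. (3.119)."""
    return gamma(v) * m * v


def kinetic_energy(m, v):
    """Relativistic kinetic energy, Eq. (3.120)."""
    return (gamma(v) - 1.0) * m * C ** 2


def total_energy(m, v):
    """Total energy of a free particle, Eq. (3.121)."""
    return gamma(v) * m * C ** 2


m = 9.109e-31  # example mass (an electron), kg
for frac in (0.01, 0.6, 0.99):
    v = frac * C
    classical = 0.5 * m * v ** 2
    print(frac, kinetic_energy(m, v) / classical, total_energy(m, v) / (momentum(m, v) * C))
```

The first ratio tends to 1 as v decreases, and the second, which equals c/v, tends to 1 as v approaches c, in line with $E \to pc$.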


3.8 SUMMARY

This chapter started with very basic independent particle systems to derive the three major statistics, i.e., the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac statistics. The classical and quantum statistics were then applied to thermodynamic systems, providing microscopic interpretations of the first, second, and third laws of thermodynamics, as well as the Bose-Einstein condensate. The velocity distribution and specific heat of ideal gases were explained based on the semi-classical statistics, followed by a brief description of quantum mechanics to understand the quantization of translational, rotational, and vibrational modes. The fundamental emission and absorption processes of molecules or atoms were discussed along with the concept of stimulated emission. Finally, matter-energy conversion was described within the framework of the relativistic theory. While most of the explanations in this chapter are semi-classical and somewhat oversimplified, they should provide a solid background to those who did not have a formal education in statistical mechanics and quantum physics. These materials will be frequently referenced in the rest of the book.

REFERENCES

1. C. L. Tien and J. H. Lienhard, Statistical Thermodynamics, Hemisphere, New York, 1985.
2. R. E. Sonntag and G. J. van Wylen, Fundamentals of Statistical Thermodynamics, Wiley, New York, 1966.
3. J. E. Lay, Statistical Mechanics and Thermodynamics of Matter, Harper Collins Publishers, New York, 1990.
4. C. E. Hecht, Statistical Thermodynamics and Kinetic Theory, W. H. Freeman and Company, New York, 1990.
5. V. P. Carey, Statistical Thermodynamics and Microscale Thermophysics, Cambridge University Press, Cambridge, UK, 1999.
6. F. C. Chou, J. R. Lukes, X. G. Liang, K. Takahashi, and C. L. Tien, "Molecular dynamics in microscale thermophysical engineering," Annu. Rev. Heat Transfer, 10, 144–176, 1999.
7. S. Maruyama, "Molecular Dynamics Method for Microscale Heat Transfer," in Advances in Numerical Heat Transfer, W. J. Minkowycz and E. M. Sparrow (eds.), Vol. 2, pp. 189–226, Taylor & Francis, New York, 2000.
8. D. J. Griffiths, Introduction to Quantum Mechanics, 2nd ed., Prentice Hall, New York, 2005.
9. H. J. Metcalf and P. van der Straten, Laser Cooling and Trapping, Springer, New York, 1999.
10. G. Burns, High-Temperature Superconductivity: An Introduction, Academic Press, Boston, MA, 1992.
11. A. Einstein, "Zur quantentheorie der strahlung," Phys. Z., 18, 121–128, 1917; English translation in Sources of Quantum Mechanics, B. L. Van der Waerden (ed.), North-Holland Publishing Company, Amsterdam, the Netherlands, 1967.
12. H. P. Baltes, "On the validity of Kirchhoff's law of heat radiation for a body in a nonequilibrium environment," Progress in Optics, 13, 1–25, 1976.
13. J. P. Gordon, H. J. Zeiger, and C. H. Townes, "The maser—New type of microwave amplifier, frequency standard, and spectrometer," Phys. Rev., 99, 1264–1274, 1955; A. L. Schawlow and C. H. Townes, "Infrared and optical masers," Phys. Rev., 112, 1940–1949, 1958.
14. R. Wolfson and J. M. Pasachoff, Physics with Modern Physics for Scientists and Engineers, 3rd ed., Addison-Wesley, Reading, MA, 1999.

PROBLEMS

3.1. For a rectangular prism (i.e., a cuboid) whose three sides are x, y, and z, if $x + y + z = 9$, find the values of x, y, and z so that the volume of the prism is maximum.


3.2. Make a simple computer program to evaluate the relative error of Stirling’s formula, $\ln x! \approx x \ln x - x$, for $x = 10$, 100, and 1000.

3.3. For each of the following cases, determine the number of ways to place 25 books on 5 shelves (distinguishable by their levels). The order of books within an individual shelf is not considered. (a) The books are distinguishable, and there is no limit on how many books can be put on each shelf. (b) Same as (a), except that all the books are the same (indistinguishable). (c) The books are distinguishable, and there are 5 books on each shelf. (d) The books are distinguishable, and there are 3 books on the 1st shelf, 4 on the 2nd, 5 on the 3rd, 6 on the 4th, and 7 on the 5th.

3.4. For each of the following cases, determine the number of ways to put 4 books on 10 shelves (distinguishable by their levels). Disregard their order on each shelf. (a) The books are distinguishable, and there is no limit on how many books you can place on each shelf. (b) Same as (a), but there is a maximum of 1 book on any shelf. (c) Same as (a), except that the books are identical (indistinguishable). (d) Same as (b), except that the books are identical.

3.5. A box contains 5 red balls and 3 black balls. Two balls are picked up randomly. Determine the following: (a) What is the probability that the second ball is red? (b) What is the probability that both are red? (c) If the first one is black, what is the probability that the second is red?

3.6. Suppose you toss two dice. What is the probability of getting a total number (a) equal to 5 and (b) greater than 5?

3.7. Draw 5 cards from a deck of 52 cards. (a) What is the probability of getting a royal flush? (b) What is the probability of getting a full house? [A royal flush is a hand with A, K, Q, J, and 10 of the same suit. A full house is a hand with three of one kind and two of another (a pair).]

3.8. Consider a Gaussian distribution function, $f(x) = a \exp[-(x - m)^2]$, where $a$ and $m$ are positive constants. (a) Find the normalized distribution function $F(x)$. (b) Show that the mean value $\bar{x} = m$. (c) Determine the variance and the standard deviation $\sigma$.

3.9. The speed distribution function for $N$ particles in a fixed volume is given by $f(V) = AV(B - V)/B^3$, where $V\ (\ge 0)$ is the particle speed, and $A$ and $B$ are positive constants. Determine: (a) The probability density function $F(V)$. (b) The number of particles $N$ in the volume. (c) The minimum speed $V_{\min}$ and maximum speed $V_{\max}$. (d) The most probable speed where the probability density function is the largest. (e) The average speed $\bar{V}$ and the root-mean-square average speed $V_{\mathrm{rms}} = \sqrt{\overline{V^2}}$.

3.10. Six bosons are to be placed in two energy levels, each with a degeneracy of two. Evaluate the thermodynamic probability of all arrangements. What is the most probable arrangement?

3.11. Four fermions are to be placed in two energy levels, each with a degeneracy of four. Evaluate the thermodynamic probability of each arrangement. What is the most probable arrangement?

3.12. Derive the Fermi-Dirac distribution step by step. Clearly state all assumptions. Under which condition can it be approximated by the Maxwell-Boltzmann distribution?

3.13. What is the Boltzmann constant and how is it related to the universal gas constant? Show that the ideal gas equation can be written as $P = n k_B T$. What is the number density of air at standard conditions (1 atm and 25°C)?

3.14. How many molecules are there per unit volume (number density) for a nitrogen gas at 200 K and 20 kPa? How would you estimate the molecular spacing (average distance between two adjacent molecules)?

3.15. Use Eq. (3.28a) and $1/T = (\partial S/\partial U)_{V,N}$ to show that $\beta = 1/k_B T$.

3.16. Show that $\beta = 1/k_B T$ and $\alpha = -\mu/k_B T$ for all the three statistics. [Hint: Follow the discussion in Sec. 3.2 with a few more steps.]

3.17. Consider 10 indistinguishable particles in a fixed volume that obey the Bose-Einstein statistics. There are three energy levels with $\varepsilon_0 = 0.5$ eu, $\varepsilon_1 = 1.5$ eu, and $\varepsilon_2 = 2.5$ eu, where “eu” refers to a certain energy unit. The degeneracies are $g_0 = 1$, $g_1 = 3$, and $g_2 = 5$, respectively.


(a) If the degeneracy were not considered, in how many possible ways could you arrange the particles on the three energy levels? (b) You may notice that different arrangements may result in the same energy. For example, both the arrangement with $N_1 = 9$, $N_2 = 0$, $N_3 = 1$ and the arrangement with $N_1 = 8$, $N_2 = 2$, $N_3 = 0$ yield an internal energy $U = 7$ eu. How many arrangements are there with $U = 9$ eu? Calculate the thermodynamic probability for all macrostates with $U = 9$ eu. (c) The ground state refers to the state corresponding to the lowest possible energy of the system. Determine the ground-state energy and entropy. What is the temperature of this system at the ground state? (d) How many microstates are there for the macrostate with $U = 25$ eu?

3.18. Consider a system of a single type of constituents, with $N$ particles (distinguishable from the statistical point of view) and only two energy levels $\varepsilon_0 = 0$ and $\varepsilon_1 = \varepsilon$ (nondegenerate). (a) What is the total number of microstates in terms of $N$? How many microstates are there for the macrostate that has energy $U = (N - 1)\varepsilon$? Show that the energy of the most probable macrostate is $N\varepsilon/2$. (b) What are the entropies of the states with $U = 0$ and $U = (N - 1)\varepsilon$? Sketch $S$ as a function of $U$. Comment on the negative temperature, $1/T = (\partial S/\partial U)_{V,N} < 0$. Is it possible to have a system with a negative absolute temperature?

3.19. A system consists of six indistinguishable particles that obey Bose-Einstein statistics with two energy levels. The associated energies are $\varepsilon_0 = 0$ and $\varepsilon_1 = \varepsilon$, and the associated degeneracies are $g_0 = 1$ and $g_1 = 3$. Answer the following questions: (a) How many possible macrostates are there? How many microstates correspond to the macrostate with three particles on each energy level? (b) What is the most probable macrostate, and what are its corresponding energy $U$ and thermodynamic probability $\Omega$? (c) Show that at 0 K, both the energy and the entropy of this system are zero. Also, show that for this system the entropy increases as the energy increases.

3.20. From the Sackur-Tetrode equation, show that $s_2 - s_1 = c_p \ln(T_2/T_1) - R \ln(P_2/P_1)$.

3.21. Write $U$, $P$, $A$, and $S$ in terms of the partition function $Z$. Express $H$ and $G$ in terms of the partition function $Z$. For an ideal monatomic gas, express $H$ and $G$ in terms of $T$ and $P$.

3.22. For an ideal diatomic gas, the partition function can be written as $Z = Z_t Z_r Z_v Z_e Z_D$, where $Z_e = g_{e0}$ is the degeneracy of the ground electronic level, and $Z_D = \exp(D_0/k_B T)$ is the chemical partition function that is associated with the reaction of formation. Here, $g_{e0}$ and $D_0$ can be regarded as constants for a given material. Contributions to the partition function besides the translation are due to internal energy storage and thus are called the internal contribution, i.e., $Z_{\mathrm{int}} = Z_r Z_v Z_e Z_D$. Find the expressions of $U$, $P$, $A$, $S$, $H$, and $G$ in terms of $N$, $T$, and $P$ (or $V$) with appropriate constants, assuming that the temperature $T \gg \Theta_r$ and is comparable with $\Theta_v$.

3.23. For an ideal molecular gas, derive the distribution function in terms of the kinetic energy $\varepsilon = mv^2/2$, i.e., $f(\varepsilon)$.

3.24. Prove Eq. (3.48), Eq. (3.49a), and Eq. (3.50).

3.25. Evaluate and plot the Maxwell speed distribution for Ar gas at 100, 300, and 900 K. Tabulate the average speed, the most probable speed, and the rms speed at these temperatures.

3.26. A special form of the Euler-Maclaurin summation formula is

$$ \sum_{j=a}^{\infty} f(j) = \int_a^{\infty} f(x)\,dx + \frac{1}{2} f(a) - \frac{1}{12} f'(a) + \frac{1}{720} f^{(3)}(a) - \frac{1}{30{,}240} f^{(5)}(a) + \cdots $$

Consider the rotational partition function,

$$ Z_r = \sum_{j=0}^{\infty} (2j + 1) \exp\left[-j(j + 1)\frac{\Theta_r}{T}\right] $$

and show that

$$ Z_r \approx \frac{T}{\Theta_r}\left[1 + \frac{1}{3}\frac{\Theta_r}{T} + \frac{1}{15}\left(\frac{\Theta_r}{T}\right)^2 + \cdots\right] $$

which is Eq. (3.55) for $\sigma = 1$.


3.27. Because of the nuclear spin degeneracy, hydrogen (H2) gas consists of two different types: ortho-hydrogen and para-hydrogen. The rotational partition functions can be written, respectively, as

$$ Z_{r,\mathrm{ortho}} = 3 \sum_{l=1,3,5,\ldots} (2l + 1) \exp\left[-l(l + 1)\frac{\Theta_r}{T}\right] $$

and

$$ Z_{r,\mathrm{para}} = \sum_{l=0,2,4,\ldots} (2l + 1) \exp\left[-l(l + 1)\frac{\Theta_r}{T}\right] $$

so that

$$ Z_{r,\mathrm{H_2}} = \sum_{l=0,2,4,\ldots} (2l + 1) \exp\left[-l(l + 1)\frac{\Theta_r}{T}\right] + 3 \sum_{l=1,3,5,\ldots} (2l + 1) \exp\left[-l(l + 1)\frac{\Theta_r}{T}\right]. $$

Evaluate the temperature-dependent specific heat of each of the two types of hydrogen, which can be separated and stay separated for a long time before the equilibrium distribution is restored. Calculate the specific heat of hydrogen in the equilibrium distribution as a function of temperature. The ratio $Z_{r,\mathrm{ortho}}/Z_{r,\mathrm{para}}$ is the same as the equilibrium ratio of the two types and varies from 0 at very low temperatures to 3 near room temperature.

3.28. Calculate the specific heat and the specific heat ratio $\gamma = c_p/c_v$ for nitrogen N2 at 30, 70, 300, and 1500 K. Assume the pressure is sufficiently low for it to be an ideal gas.

3.29. Calculate the specific heat and the specific heat ratio $\gamma = c_p/c_v$ for oxygen O2 at 50, 100, 300, and 2000 K. Assume the pressure is sufficiently low for it to be an ideal gas.

3.30. Estimate the mole and mass specific heats of CO gas at 100, 300, and 3000 K. Show in a specific heat versus temperature graph the contributions from different modes.

3.31. (a) How many rotational degrees of freedom are there in a CO2 molecule and in a H2O molecule? (b) If the temperature of a low-pressure CO2 gas is raised high enough to completely excite its rotational and vibrational modes, what will be its specific heats $c_v$ and $c_p$? Express your answer in both kJ/(kg K) and kJ/(kmol K).

3.32. Compute and plot the temperature-dependent specific heat for the following ideal gases and compare your results with tabulated data or graphs: (a) CO2, (b) H2O, and (c) CH4.

3.33. Do a literature search to discuss the following topics: (a) the significance of partition functions, (b) the different types of statistical ensembles, and (c) statistical fluctuations.

3.34. We have discussed the translational degeneracy $dg$ in a 3-D space with a volume $V$, as given in Eq. (3.85). Consider the situation when the particle is confined in a 2-D square potential well. Find the proper wavefunctions and the energy eigenvalues. Assuming the area $A$ is very large, find the translational degeneracy $dg$ in terms of $A$, $m$, $\varepsilon$, and $d\varepsilon$.

3.35. Estimate the speed an electron needs in order to escape from the ground state of a hydrogen atom. What is the de Broglie wavelength of the electron at the initial speed? If a photon is used to knock out the electron in the ground state, what would be the wavelength of the photon? Why is it inappropriate to consider the electron movement in an atom as an analogy to the movement of Mars in the solar system?

3.36. For the harmonic oscillator problem discussed in Sec. 3.5.5, show that Eq. (3.111) is a solution of Eq. (3.107) for $n = 0$, 1, and 2. Plot $\psi_0^2$, $\psi_1^2$, and $\psi_2^2$ to discuss the differences between classical mechanics and quantum mechanics.


CHAPTER 4

KINETIC THEORY AND MICRO/NANOFLUIDICS

Statistical mechanics involves determination of the most probable state and equilibrium distributions, as well as evaluation of the thermodynamic properties in the equilibrium states. Kinetic theory deals with the local average of particle properties and can be applied to nonequilibrium conditions to derive transport equations.1–7 Kinetic theory, statistical mechanics, and molecular dynamics are based on the same hypotheses; they are closely related and overlap each other in some aspects. Knowledge of kinetic theory is important to understanding gas dynamics, as well as electronic and thermal transport phenomena in solid materials. In this chapter, we first introduce the simple kinetic theory of ideal gases based on the mean-free-path approximation. While it can help us obtain the microscopic formulation of several familiar transport equations and properties, the simple kinetic theory is limited to local equilibrium and, hence, is good only for time durations much longer than the mechanistic timescale, called the relaxation time. The advanced kinetic theory is based on the Boltzmann transport equation (BTE), which will also be presented in this chapter. The BTE is an integro-differential equation of the distribution function in terms of space, velocity, and time. It takes into account changes in the distribution function caused by external forces and collisions between particles. Many macroscopic phenomenological equations, such as Fourier’s law of heat conduction, the Navier-Stokes equation for viscous flow, and the equation of radiative transfer for photons and phonons, can be derived from the BTE, under the assumption of local equilibrium. Finally, in the last section of this chapter, we present the application of kinetic theory to the flow of dilute gases in micro/nanostructures and the associated heat transfer. The application of kinetic theory to heat conduction in metals and dielectrics will be discussed in forthcoming chapters.

4.1 KINETIC DESCRIPTION OF DILUTE GASES

In this section, we will introduce the simple kinetic theory of ideal molecular gases. The purpose is to provide a step-by-step learning experience leading to more advanced topics. There are several hypotheses and assumptions in the kinetic theory of molecules.

• Molecular hypothesis: Matter is composed of small discrete particles (molecules or atoms); any macroscopic volume contains a large number of particles. At 25°C and 1 atm, a 1-μm³ volume of an ideal gas contains roughly 25 million molecules.


• Statistical hypothesis: Time average is often used since any macroscopic observation takes much longer than the characteristic timescale of molecular motion (such as the average time lapse between two subsequent collisions of a given molecule).

• Kinetic hypothesis: Particles obey the laws of classical mechanics.

• Molecular chaos: The velocity and position of a particle are uncorrelated. The velocities of any two particles are not correlated.

• Ideal gas assumptions: Molecules are rigid spheres resembling billiard balls. Each molecule has a diameter $d$ and a mass $m$. All collisions are elastic and conserve both energy and momentum. Molecules are widely separated in space (i.e., a dilute gas). Intermolecular forces are negligible except during molecular collisions. The duration of collision is negligible compared with the time between collisions. No collision can occur with more than two particles.

The general molecular distribution function is $f(\mathbf{r}, \mathbf{v}, t)$, which is a function of space, velocity, and time. The distribution function gives the particle (number) density in the phase space at any time. Therefore, the number of particles in a volume element of the phase space is

$$ dN = f(\mathbf{r}, \mathbf{v}, t)\,dx\,dy\,dz\,dv_x\,dv_y\,dv_z = f(\mathbf{r}, \mathbf{v}, t)\,dV\,d\varpi \tag{4.1} $$

where we have used $\varpi$ for the velocity space ($d\varpi = dv_x\,dv_y\,dv_z$). Integrating Eq. (4.1) over the velocity space gives the number of particles per unit volume, or the number density, as

$$ n(\mathbf{r}, t) = \frac{dN}{dV} = \int_{\varpi} f(\mathbf{r}, \mathbf{v}, t)\,d\varpi \tag{4.2} $$

Note that the density is $\rho(\mathbf{r}, t) = m \cdot n(\mathbf{r}, t)$, where $m$ is the mass of a particle. The total number of particles inside the volume $V$ as a function of time is then

$$ N(t) = \iint_{V,\varpi} f(\mathbf{r}, \mathbf{v}, t)\,dV\,d\varpi \tag{4.3} $$

In a thermodynamic equilibrium state,

$$ f(\mathbf{r}, \mathbf{v}, t) = f(\mathbf{v}) \tag{4.4} $$

which is independent of space and time. Any intensive property will be the same everywhere.

4.1.1 Local Average and Flux

Let $\psi = \psi(\mathbf{r}, \mathbf{v}, t)$ be any additive property of a single molecule, such as kinetic energy and momentum. Note that $\psi$ may be a scalar or a vector. The local average or simply the average of the property $\psi$ is defined as

$$ \bar{\psi} = \frac{\int_{\varpi} f\psi\,d\varpi}{\int_{\varpi} f\,d\varpi} = \frac{1}{n}\int_{\varpi} f\psi\,d\varpi \tag{4.5} $$

which is a function of $\mathbf{r}$ and $t$. The ensemble average is the average over the phase space, i.e.,

$$ \langle\psi\rangle = \frac{1}{N}\iint_{V,\varpi} f\psi\,dV\,d\varpi \tag{4.6} $$

For a uniform gas, the local average and the ensemble average are the same.


The transfer of $\psi$ across an area element $dA$ per unit time per unit area is called the flux of $\psi$.

FIGURE 4.1 Illustration of the flux of particles and quantities through a surface.

As shown in Fig. 4.1, particles having velocities between $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$ that will pass through the area $dA$ in the time interval $dt$ must be contained in the inclined cylinder, whose volume is $dV = v\,dt\cos\theta\,dA = \mathbf{v} \cdot \mathbf{n}\,dA\,dt$. It is assumed that $dt$ is sufficiently small such that particle-particle collisions can be neglected. The number of particles with velocities between $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$ within the inclined cylinder can be calculated by

$$ f(\mathbf{r}, \mathbf{v}, t)\,dV\,d\varpi = f(\mathbf{r}, \mathbf{v}, t)\,\mathbf{v} \cdot \mathbf{n}\,dA\,dt\,d\varpi \tag{4.7} $$

The flux of the property $\psi$ is then

$$ \text{flux of } \psi \text{ within } d\varpi = \frac{\psi f(\mathbf{r}, \mathbf{v}, t)\,\mathbf{v} \cdot \mathbf{n}\,d\varpi\,dA\,dt}{dA\,dt} $$

Integrating over all velocities yields the total flux of $\psi$:

$$ J_{\psi} = \int_{\varpi} \psi f\,\mathbf{v} \cdot \mathbf{n}\,d\varpi \tag{4.8} $$

Equation (4.8) gives the net flux since it is evaluated for all $\varpi$, or over a solid angle of $4\pi$ in the spherical coordinates. Very often the integration is performed over the hemisphere with $\mathbf{v} \cdot \mathbf{n} = v\cos\theta > 0$ for positive flux or $\mathbf{v} \cdot \mathbf{n} = v\cos\theta < 0$ for negative flux. When $\psi = 1$, Eq. (4.8) gives the particle flux:

$$ J_N = \int_{\varpi} f\,\mathbf{v} \cdot \mathbf{n}\,d\varpi \tag{4.9} $$

In an equilibrium state, this integration can be evaluated using the spherical coordinates. Noting that $\mathbf{v} \cdot \mathbf{n} = v\cos\theta$ and $f = f(\mathbf{v})$, which is independent of the direction (isotropic), we can obtain the particle flux in the positive $z$ direction by integrating over the hemisphere in the velocity space, i.e.,

$$ J_N = \int_{v=0}^{\infty}\int_{\phi=0}^{2\pi}\int_{\theta=0}^{\pi/2} f(\mathbf{v})\,v^3\cos\theta\sin\theta\,d\theta\,d\phi\,dv = \pi\int_0^{\infty} f(\mathbf{v})\,v^3\,dv \tag{4.10} $$


In writing Eq. (4.10), we have kept the vector variable in $f(\mathbf{v})$ to signify that it is a velocity distribution. One should bear in mind that the last expression is based on the fact that $f(\mathbf{v})$ is not a function of $\theta$ and $\phi$. For an ideal molecular gas, $f(\mathbf{v})$ is given by the Maxwell velocity distribution, i.e., Eq. (3.43) in Chap. 3. If the integration in Eq. (4.10) is performed over the whole sphere with $\theta$ from 0 to $\pi$, we would obtain the net flux of particles, which is zero in the equilibrium case. The average speed can be evaluated using Eq. (4.5); hence,

$$ \bar{v} = \frac{1}{n}\int_{\varpi} f(\mathbf{v})\,v\,d\varpi = \frac{1}{n}\iiint_{v,\phi,\theta} f(\mathbf{v})\,v^3\sin\theta\,d\theta\,d\phi\,dv = \frac{4\pi}{n}\int_0^{\infty} f(\mathbf{v})\,v^3\,dv \tag{4.11} $$

Here, we have assumed an isotropic distribution function to obtain the last expression. The above equation is evaluated over the solid angle of $4\pi$ to obtain the average of all velocities. Comparing Eq. (4.10) and Eq. (4.11), we can see that

$$ J_N = \frac{n\bar{v}}{4} \tag{4.12a} $$

For an ideal gas, since $f(\mathbf{v})$ is given by the Maxwell velocity distribution, Eq. (3.44), we obtain

$$ J_N = \frac{n\bar{v}}{4} = n\sqrt{\frac{k_B T}{2\pi m}} \tag{4.12b} $$

Because each particle has the same mass, the mass flux is given by

$$ J_m = m\int_{\varpi} f\,\mathbf{v} \cdot \mathbf{n}\,d\varpi = \frac{\rho\bar{v}}{4} \tag{4.13} $$
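The relations among Eqs. (4.10) to (4.12b) can be checked numerically. The following is a minimal sketch, assuming the standard Maxwell velocity distribution $f(\mathbf{v}) = n(m/2\pi k_B T)^{3/2}\exp(-mv^2/2k_B T)$ and argon at 300 K and 1 atm as an illustrative gas; the function names and the integration cutoff are choices made here, not part of the text.

```python
import math

KB = 1.380649e-23          # Boltzmann constant, J/K
AMU = 1.66053907e-27       # atomic mass unit, kg

def maxwell_f(v, n, m, T):
    """Maxwell velocity distribution f(v), per unit volume and velocity-space volume."""
    a = m / (2.0 * math.pi * KB * T)
    return n * a**1.5 * math.exp(-m * v * v / (2.0 * KB * T))

def simpson(func, lo, hi, steps=20000):
    """Composite Simpson rule; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def flux_and_mean_speed(n, m, T):
    """Return (J_N from Eq. 4.10, mean speed from Eq. 4.11)."""
    # Cutoff at 12 most-probable speeds: the Gaussian tail beyond is negligible.
    v_max = 12.0 * math.sqrt(2.0 * KB * T / m)
    moment3 = simpson(lambda v: maxwell_f(v, n, m, T) * v**3, 0.0, v_max)
    j_n = math.pi * moment3               # Eq. (4.10)
    v_bar = 4.0 * math.pi * moment3 / n   # Eq. (4.11)
    return j_n, v_bar

# Illustrative (assumed) conditions: argon at 300 K and 1 atm.
T = 300.0
m_ar = 39.948 * AMU
n = 101325.0 / (KB * T)                   # ideal gas, P = n kB T

j_n, v_bar = flux_and_mean_speed(n, m_ar, T)
j_closed = n * math.sqrt(KB * T / (2.0 * math.pi * m_ar))   # Eq. (4.12b)

print(f"mean speed      = {v_bar:.1f} m/s")
print(f"J_N (numerical) = {j_n:.4e} 1/(m^2 s)")
print(f"J_N (closed)    = {j_closed:.4e} 1/(m^2 s)")
```

The numerical flux should agree with both $n\bar{v}/4$ and the closed form to within the quadrature error, which is one way to confirm the factor of 1/4 between the hemispherical flux and the mean speed.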

Substituting $\psi = mv^2/2$ into Eq. (4.8), one obtains the kinetic energy flux $J_{\mathrm{KE}}$. In an equilibrium state with an isotropic distribution, the kinetic energy flux in the positive $z$ direction is $J_{\mathrm{KE}} = (\pi m/2)\int_0^{\infty} f(\mathbf{v})\,v^5\,dv$, whereas the net kinetic energy flux is zero. Note that Eq. (4.8) is a general equation that is also applicable to nonequilibrium and anisotropic distributions. When $\psi = m\mathbf{v}$, the momentum flux is a vector, which is often handled by considering individual components. Note that the rate of transfer of momentum across a unit area is equal to the force that the area must exert upon the gas to sustain the equilibrium. Furthermore, the surface may be projected to three orientations, yielding a nine-component tensor in the momentum flux:

$$ P_{ij} = \int_{\varpi} (mv_j)\,f\,v_i\,d\varpi, \qquad i, j = 1, 2, 3 \tag{4.14a} $$

Here, $(v_1, v_2, v_3)$ and $(v_x, v_y, v_z)$ are used interchangeably. Let $P = \rho\overline{v_i^2}$, which is always positive, and $\tau_{ij} = -\rho\overline{v_j v_i}$ for $i \ne j$ and 0 for $i = j$. We can rewrite the above equation as

$$ P_{ij} = nm\overline{v_j v_i} = \rho\overline{v_j v_i} = P\delta_{ij} - \tau_{ij} \tag{4.14b} $$

where $\delta_{ij}$ is the Kronecker delta, which is equal to 1 when $i = j$ and 0 when $i \ne j$. It can be seen that $P$ is the normal stress or static pressure and $\tau_{ij}$ ($i \ne j$) is the shear stress, which is zero in a uniform, stationary gas (without bulk motion). Notice that the velocity distribution in the


vicinity of the wall is the same as that away from the wall because of the reflection by the wall. The pressure is now related to the momentum flux, i.e., $3P = \rho(\overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2}) = \rho\overline{v^2}$, or

$$ P = \frac{1}{3}\rho\overline{v^2} \tag{4.15} $$

which is Boyle’s law. Compared with the ideal gas equation, the right-hand side must be related to temperature. In kinetic theory, temperature is associated with the mean translational kinetic energy of the molecule, i.e.,

$$ \frac{3}{2}k_B T = \frac{1}{2}m\overline{v^2} = \frac{1}{2}m\overline{v_x^2} + \frac{1}{2}m\overline{v_y^2} + \frac{1}{2}m\overline{v_z^2} \tag{4.16} $$

We have derived this equation from statistical mechanics in Chap. 3. The temperature defined based on the kinetic energy of the particles is sometimes referred to as the kinetic temperature. Combining Eq. (4.15) and Eq. (4.16), we get the ideal gas equation, $P = nk_B T$, as expected. From the above discussion, one can see clearly how the macroscopic properties such as pressure and temperature are related to the particle distribution function. For ideal gases at equilibrium, we have derived the Maxwell velocity and speed distributions in Chap. 3.

Example 4-1. Show that $P = \rho\overline{v_n^2}$, where $v_n$ is the velocity component normal to the wall, and $P = \rho\overline{v^2}/3$ for an equilibrium distribution.

Solution. Consider the horizontal plane shown in Fig. 4.1 as the wall, below which is a gas in equilibrium. Multiplying Eq. (4.7) by $m\mathbf{v}$ gives the momentum of the particles with velocities between $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$ impinging on the wall, $m\mathbf{v}f(\mathbf{v})\,\mathbf{v} \cdot \mathbf{n}\,dA\,dt\,d\varpi$, which of course is equal to the impulse on the wall, $d\mathbf{F}\,dt$. The normal component $v_n = \mathbf{v} \cdot \mathbf{n} = v\cos\theta$ contributes an impulse on the wall, $mv_n^2 f(\mathbf{v})\,dA\,dt\,d\varpi$, that is always positive regardless of the sign of $v_n$. However, the contributions of all parallel components cancel out due to isotropy. The pressure can be evaluated by integrating over all velocities, i.e., $P = \int_{\varpi} mv_n^2 f(\mathbf{v})\,d\varpi = mn\overline{v_n^2} = \rho\overline{v_n^2}$. We have used the definition of local average given by Eq. (4.5). If the distribution is isotropic, then $P = m\int_0^{\infty}\int_0^{2\pi}\int_0^{\pi} f(\mathbf{v})\,v^4\cos^2\theta\sin\theta\,d\theta\,d\phi\,dv = (4\pi m/3)\int_0^{\infty} f(\mathbf{v})\,v^4\,dv$ since $v_n = v\cos\theta$. Compared with $\overline{v^2} = (1/n)\int_0^{\infty}\int_0^{2\pi}\int_0^{\pi} f(\mathbf{v})\,v^4\sin\theta\,d\theta\,d\phi\,dv = (4\pi/n)\int_0^{\infty} f(\mathbf{v})\,v^4\,dv$, we obtain $P = mn\overline{v^2}/3 = \rho\overline{v^2}/3$. The distribution function is uniform inside the container; hence, the wall may be a physical wall or merely an imaginary one since pressure exists everywhere in the fluid.
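The result of Example 4-1 can also be illustrated by sampling. The sketch below is a Monte Carlo check, assuming that in equilibrium each velocity component is normally distributed with variance $k_B T/m$ (the Maxwell distribution), and using nitrogen at 25°C and 1 atm as an illustrative case; the sample size, seed, and function name are arbitrary choices made here.

```python
import math
import random

KB = 1.380649e-23          # Boltzmann constant, J/K
AMU = 1.66053907e-27       # atomic mass unit, kg

def sampled_pressures(n, m, T, samples=200_000, seed=0):
    """Estimate rho*<v_n^2> and rho*<v^2>/3 from sampled equilibrium velocities."""
    rng = random.Random(seed)
    sigma = math.sqrt(KB * T / m)     # std. dev. of each velocity component
    sum_vn2 = 0.0
    sum_v2 = 0.0
    for _ in range(samples):
        vx = rng.gauss(0.0, sigma)
        vy = rng.gauss(0.0, sigma)
        vz = rng.gauss(0.0, sigma)    # take z as the wall-normal direction
        sum_vn2 += vz * vz
        sum_v2 += vx * vx + vy * vy + vz * vz
    rho = n * m
    return rho * sum_vn2 / samples, rho * sum_v2 / (3.0 * samples)

# Illustrative (assumed) conditions: nitrogen at 25 C and 1 atm.
T = 298.15
m_n2 = 28.0134 * AMU
n = 101325.0 / (KB * T)               # number density from P = n kB T
n_per_um3 = n * 1e-18                 # molecules in one cubic micrometer

p_normal, p_iso = sampled_pressures(n, m_n2, T)
p_ideal = n * KB * T

print(f"number density   = {n:.3e} 1/m^3 ({n_per_um3:.2e} per um^3)")
print(f"rho <v_n^2>      = {p_normal:.0f} Pa")
print(f"rho <v^2> / 3    = {p_iso:.0f} Pa")
print(f"n kB T           = {p_ideal:.0f} Pa")
```

Both sampled estimates should fall within a fraction of a percent of $nk_B T$; the residual scatter is statistical and shrinks as the number of samples grows.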

4.1.2 The Mean Free Path

The mean free path, defined as the average distance the particle travels between two subsequent collisions, is a very important concept. It is often used to determine whether a given phenomenon belongs to the macroscale (continuum) regime or otherwise falls in the microscale regime when the governing equations derived under the assumption of local equilibrium break down. One of the applications is in microfluidics, to be discussed later in this chapter, and another is in the electrical and heat conduction in solids, which will be studied in Chap. 6.

FIGURE 4.2 Schematic used for a simple derivation of the mean free path.

Consider the case in Fig. 4.2: a particle of diameter $d$ moving at an average velocity $\bar{v}$ (assuming all other particles are at rest). During a time interval $dt$, the volume swept by the particle within $d$ from the centerline is $dV = \pi d^2\bar{v}\,dt$. The $n\,dV$ particles, whose centers are inside this volume element, will collide with the moving particle. Therefore, the frequency of collisions, i.e., the number of collisions per unit time, is $\pi n d^2\bar{v}$. The time between two subsequent collisions, $\tau$, is the inverse of the frequency of collision. The mean free path $\Lambda$ is the average distance that a particle travels between two subsequent collisions and is equal to the ratio of the average velocity to the frequency of collision. Therefore,

$$ \Lambda = \bar{v}\tau \approx (\pi n d^2)^{-1} \tag{4.17} $$

and depends only on the particle size and the number density. The average time between two subsequent collisions $\tau$ is termed the relaxation time, and the average frequency of collision $\tau^{-1}$ is the scattering rate or collision rate. The scattering rate is the average number of collisions an individual particle experiences per unit time. For electrons whose diameters are negligible compared with that of the other particles that scatter them, the mean free path is

$$ \Lambda_{\text{electron (or photon)}} = \frac{1}{nA_c} \tag{4.18} $$

where $A_c$ is the scattering cross-sectional area and $n$ is the number density of the scatterers, such as phonons or defects. Equation (4.18) also applies for the case of photons that can be scattered by particles, such as molecules in the atmosphere. The photon mean free path is also called the radiation penetration depth, as will be discussed in Chap. 8. When the relative movement of particles is considered based on the Maxwell velocity distribution, Eq. (4.17) is modified slightly for an ideal gas as follows:

$$ \Lambda = \frac{1}{\sqrt{2}\,\pi n d^2} = \frac{k_B T}{\sqrt{2}\,\pi d^2 P} \tag{4.19} $$
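To give a feel for the magnitudes involved, the following sketch evaluates the simple estimate of Eq. (4.17) and the Maxwell-corrected value, which is smaller by a factor of $\sqrt{2}$, for air near room conditions. The effective hard-sphere diameter of 0.37 nm and the mean molar mass of 28.97 g/mol are assumed illustrative inputs, not values taken from the text.

```python
import math

KB = 1.380649e-23          # Boltzmann constant, J/K
AMU = 1.66053907e-27       # atomic mass unit, kg

def mean_free_path_simple(n, d):
    """Eq. (4.17): all other particles are treated as stationary."""
    return 1.0 / (math.pi * n * d * d)

def mean_free_path_maxwell(n, d):
    """Relative motion included (Maxwell distribution): extra factor 1/sqrt(2)."""
    return 1.0 / (math.sqrt(2.0) * math.pi * n * d * d)

# Illustrative (assumed) inputs for air at 25 C and 1 atm.
T = 298.15                 # K
P = 101325.0               # Pa
d_air = 3.7e-10            # assumed effective hard-sphere diameter, m
m_air = 28.97 * AMU        # assumed mean molecular mass, kg

n = P / (KB * T)                                     # number density, 1/m^3
v_bar = math.sqrt(8.0 * KB * T / (math.pi * m_air))  # mean speed, m/s

lam_simple = mean_free_path_simple(n, d_air)
lam_maxwell = mean_free_path_maxwell(n, d_air)
tau = lam_maxwell / v_bar                            # relaxation time, s

print(f"simple mean free path   = {lam_simple * 1e9:.1f} nm")
print(f"Maxwell-corrected value = {lam_maxwell * 1e9:.1f} nm")
print(f"relaxation time         = {tau:.2e} s")
```

With these inputs the mean free path comes out at several tens of nanometers, which is why gas flow in channels of comparable size can no longer be treated as a continuum.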

$$ v_{tw} = \frac{v_g}{\sqrt{3}} \tag{7.7} $$

Equation (7.7) relates the speed of the temperature wave to the speed of sound in an insulator. The square root of three can be understood as due to the randomness of thermal fluctuations in a 3-D medium, just like the relation between the velocity and its components, $v^2 = v_x^2 + v_y^2 + v_z^2$, in kinetic theory. Equation (7.5) indeed sets a limit on the heat propagation speed, which is manifested by a sharp wavefront that travels at $v_{tw}$ inside the medium for a sudden temperature change at the boundary. As a wave equation, the solution is a temperature field with both an amplitude and a phase. Theoretically, the temperature wave can be reflected by another boundary and can interfere, constructively or destructively, with a forward propagating wave. The interaction between the temperature waves may also result in a resonance effect, a typical wave phenomenon. Numerous analytical and numerical predictions have been made, as referenced in the work of Özişik and Tzou,4 along with Yeung and Lam,5 Haji-Sheikh et al.,6 and Gembarovic and Gembarovic, Jr.7 It should be noted that the terms heat wave3 and thermal wave4 have also been frequently used in the literature to describe the temperature wave behavior. The term “temperature wave” is used in this chapter for the wavelike behavior associated with the hyperbolic-type heat equations, because “heat wave” might be confused with the calamitous weather phenomenon and “thermal wave” might be confused with the diffusion wave used in photoacoustic techniques. Bennett and Patty (Appl. Opt., 21, 49, 1982) clarified: “The term thermal wave interference is used to mean the superposition of simple harmonic solutions of the thermal diffusion equation. Although wavelike in nature there are important differences between thermal waves arising from a differential equation that is of the first order in time and waves that are solution to a wave equation that is of the second order in time.”
In the heat transfer literature, thermal wave often refers to periodic-heating techniques used widely for thermophysical property measurements.

Let us consider an example of a semi-infinite solid under a constant heat flux at the surface. Figure 7.2 illustrates the solutions for a small $t_p$ and a large $t_p$, compared with $\tau$. Here again, we have assumed $\tau_q = \tau$. The propagation speed is equal to $v_{tw}$, and the pulse wavefront is given by $x_1 = v_{tw}t_1$ and $x_2 = v_{tw}t_2$. Hence, $x_1 < x_2 < \Lambda$, where $\Lambda = v_g\tau$ is the mean free path. In the case of a short pulse, the temperature pulse propagates and its height decays by dissipating its energy to the medium as it travels. The parabolic heat equation, on the other hand, predicts a continuous temperature distribution without any wavefront (see Fig. 7.2). As time passes on, the first-order time derivative, or the diffusion term, in Eq. (7.5) dominates. If the relative change of $\partial T/\partial t$ or $\mathbf{q}''$ during one $\tau_q$ is large, then the wave feature is important. This should happen immediately after a sudden thermal disturbance that results in a temporal nonequilibrium, as well as a spatial nonequilibrium near the heat pulse or the wavefront. After a sufficiently long time, usually 5 to 10 times $\tau_q$, a local equilibrium will be reestablished, and the thermal field can be described by the parabolic heat equation. At steady state, the hyperbolic and parabolic equations predict the same results. While Eq. (7.4) is mathematically more general than the heat diffusion equation, it should not be taken as a correction, or a more realistic theory than the Fourier conduction model, because the Cattaneo equation has not been justified on a fundamental basis, nor has it been validated by any plausible experiments.

FIGURE 7.2 (Not to scale) Illustration of the solution of the hyperbolic heat equation at short timescales. (a) A short pulse, $t_p \ll \tau$. (b) A long pulse, $t_p > \tau$. The solid curves are the solutions of the hyperbolic heat equation, Eq. (7.5), and the dash-dotted and dashed curves are the solutions, calculated from Eq. (7.1), obtained from the heat diffusion equation.

Many researchers have investigated the hyperbolic heat equation based on the second law of thermodynamics.8–10 It has been found that the hyperbolic heat equation sometimes predicts a negative entropy generation and even allows energy to be transferred from a lower-temperature region to a higher-temperature region. The entropy generation rate for heat conduction without an internal source can be calculated by10

$$ \dot{s}_{\mathrm{gen}} = -\frac{1}{T^2}\,\mathbf{q}'' \cdot \nabla T = \frac{1}{\kappa T^2}\,\mathbf{q}'' \cdot \left(\mathbf{q}'' + \tau_q\frac{\partial\mathbf{q}''}{\partial t}\right) \tag{7.8} $$

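The above equation was obtained by setting the energy and entropy balances as follows:

$$ \rho\frac{\partial u}{\partial t} = -\nabla \cdot \mathbf{q}'' \quad \text{and} \quad \rho\frac{\partial s}{\partial t} = -\nabla \cdot \left(\frac{\mathbf{q}''}{T}\right) + \dot{s}_{\mathrm{gen}} \tag{7.9} $$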
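The sign behavior of Eq. (7.8) can be illustrated with a few lines of code. The sketch below evaluates the 1-D form $\dot{s}_{\mathrm{gen}} = q''(q'' + \tau_q\,\partial q''/\partial t)/(\kappa T^2)$ for two hypothetical states: a steady flux, and a flux that is decaying faster than $q''/\tau_q$. All numerical values and the function name are illustrative assumptions, not data from the text.

```python
def entropy_generation_rate(q, dq_dt, tau_q, kappa, T):
    """1-D form of Eq. (7.8): q (q + tau_q dq/dt) / (kappa T^2), in W/(m^3 K)."""
    return q * (q + tau_q * dq_dt) / (kappa * T * T)

# Hypothetical state: values chosen only to show the sign change.
q = 1.0e4          # heat flux, W/m^2
kappa = 100.0      # thermal conductivity, W/(m K)
T = 300.0          # temperature, K
tau_q = 1.0e-12    # relaxation time, s

# Steady flux (dq/dt = 0): the Fourier result q^2 / (kappa T^2), never negative.
s_fourier = entropy_generation_rate(q, 0.0, tau_q, kappa, T)

# Flux collapsing at three times the rate q / tau_q: q + tau_q dq/dt = -2q < 0.
s_decay = entropy_generation_rate(q, -3.0 * q / tau_q, tau_q, kappa, T)

print(f"steady flux  : s_gen = {s_fourier:.2f} W/(m^3 K)")
print(f"decaying flux: s_gen = {s_decay:.2f} W/(m^3 K)")
```

The negative value arises whenever $q''$ and $q'' + \tau_q\,\partial q''/\partial t$ have opposite signs, which by the Cattaneo equation means the flux momentarily points up the temperature gradient.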

Note that $du = T\,ds$. A negative entropy generation can easily be numerically demonstrated from Eq. (7.5) during the temperature wave propagation. Here, a negative entropy generation does not constitute a violation of the second law of thermodynamics because the concept of “temperature” in the hyperbolic heat equation cannot be interpreted in the conventional sense due to the lack of local thermal equilibrium. Extended irreversible thermodynamics has been proposed by Jou et al. by modifying the definition of entropy such that it is not a property of the system anymore but depends on the heat flux vector.11 The theory of extended irreversible thermodynamics is self-consistent but has not been experimentally validated; hence, it cannot be taken as a generalized thermodynamic theory. Similarly, the hyperbolic heat equation should not be treated as a more general theory over Fourier’s heat conduction theory.

Example 7-2. Derive the modified Fourier equation, or the Cattaneo equation, based on the BTE under the relaxation time approximation.

Solution. Tavernier (C. R. Acad. Sci., 254, 69, 1962) first showed that the Cattaneo equation could be derived for phonons and electrons using the relaxation time approximation of the BTE. Let us first review Sec. 4.3.2, where we have derived Fourier’s law based on the BTE. Again, let us start


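by assuming that the temperature gradient is in the $x$ direction only. The transient 1-D BTE under the relaxation time approximation can be written as follows:

$$ \frac{\partial f}{\partial t} + v_x\frac{\partial f}{\partial x} = \frac{f_0 - f}{\tau} $$

A further assumption is made such that $\partial f/\partial x \approx \partial f_0/\partial x = (\partial f_0/\partial T)(\partial T/\partial x)$, which is exactly the condition of local equilibrium. Multiplying the earlier equation by $\tau\varepsilon v_x$ and then integrating each term over the momentum space, we obtain, by noting $\int_{\varpi} \varepsilon v_x f_0\,d\varpi = 0$, that

$$ \tau\frac{\partial}{\partial t}\int_{\varpi} v_x f\varepsilon\,d\varpi + \int_{\varpi} \tau v_x^2\frac{\partial f_0}{\partial T}\frac{\partial T}{\partial x}\varepsilon\,d\varpi = -\int_{\varpi} v_x f\varepsilon\,d\varpi \tag{7.10a} $$

or

$$ \tau\frac{\partial q_x''}{\partial t} + \kappa\frac{\partial T}{\partial x} = -q_x'' \tag{7.10b} $$

which can be generalized to the 3-D case as given in Eq. (7.2), after replacing $\tau$ with $\tau_q$.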
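To see what the extra time-derivative term in Eq. (7.10b) does, one can integrate $\tau\,dq_x''/dt + q_x'' = -\kappa\,\partial T/\partial x$ for a temperature gradient that is switched on at $t = 0$ and then held fixed. The following is a minimal sketch using forward Euler time stepping; the conductivity, gradient, and relaxation time are illustrative assumptions, and the analytic solution $q_x'' = -\kappa(\partial T/\partial x)(1 - e^{-t/\tau})$ is used only as a reference.

```python
import math

def cattaneo_step_response(kappa, grad_T, tau, t_end, steps=10000):
    """Forward-Euler integration of tau dq/dt + q = -kappa grad_T with q(0) = 0.

    Returns a list of (time, heat flux) pairs, including the initial point.
    """
    dt = t_end / steps
    q = 0.0
    history = [(0.0, q)]
    for i in range(1, steps + 1):
        q += dt * (-(q + kappa * grad_T) / tau)
        history.append((i * dt, q))
    return history

# Illustrative (assumed) values.
kappa = 150.0      # thermal conductivity, W/(m K)
grad_T = -1.0e6    # imposed temperature gradient, K/m
tau = 1.0e-11      # relaxation time, s

history = cattaneo_step_response(kappa, grad_T, tau, t_end=10.0 * tau)
q_fourier = -kappa * grad_T            # steady Fourier value

t_5tau, q_5tau = history[5000]         # sample at t = 5 tau
frac_5tau = q_5tau / q_fourier
frac_exact = 1.0 - math.exp(-5.0)

print(f"q / q_Fourier at t = 5 tau: {frac_5tau:.5f} (analytic {frac_exact:.5f})")
print(f"q / q_Fourier at t = 10 tau: {history[-1][1] / q_fourier:.5f}")
```

The heat flux builds up exponentially toward the Fourier value and is within about 1 percent of it after five relaxation times, consistent with the statement that the parabolic description is recovered after roughly 5 to 10 times $\tau_q$.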

The derivation given in this example, however, does not provide a microscopic justification of the hyperbolic heat equation, because it is strictly valid only under the local-equilibrium assumption with an averaged relaxation time. The local-equilibrium assumption prohibits application of the derived equation to length scales comparable to or smaller than the mean free path.12 Suppose a thermal disturbance occurs at a certain time and location; after a duration of time that is much longer than the relaxation time, the Fourier law and the parabolic heat equation are well justified because both the spatial and temporal local-equilibrium conditions are met. On the other hand, if we wish to use the modified Fourier equation to study the transient behavior at a timescale less than $\tau$, then the disturbance will propagate by a distance shorter than the mean free path, as shown in Fig. 7.2. Therefore, the derivation based on the BTE, under local-equilibrium and relaxation time approximations, is not a microscopic proof of the hyperbolic heat equation, which is meaningful only in a nonequilibrium situation. To this end, it appears that Maxwell made the right choice in dropping terms involving the relaxation time in the paper (Phil. Trans. R. Soc. London, 157, 49, 1867), by assessing that the rate of conduction will rapidly establish itself. While the previous derivation does not support Eq. (7.2), it does not disprove Eq. (7.2) either because the relaxation time approximation is not a very good model in the nonequilibrium regime. The local-equilibrium assumption breaks down completely at extremely short timescales. The basic assumption in the relaxation time approximation is that the distribution function is not too far from equilibrium. For a heat pulse with a duration less than $\tau$, the relaxation time approximation should generally be applied when $t > \tau$, regardless of whether we are dealing with a thin film or a semi-infinite medium.

What may be concluded is that we have failed to prove, either by any fundamental theory or by any credible experiments, that the Cattaneo equation, which originated from the kinetic theory according to the relaxation time approximation, is a physical law that extends Fourier’s law to the nonequilibrium regime. Atomistic simulations, based on molecular dynamics and the lattice Boltzmann method, have provided further evidence that the hyperbolic heat equation is not applicable at very short timescales or in the nonequilibrium regime, where the applicability of the relaxation time approximation is also questionable.13,14 For this reason, we have intentionally avoided phrases like “generalized Fourier’s equation” and “modified Fourier’s law” in describing Eq. (7.2). One might argue that when $\tau_q$ was identified as the average time $\tau$ between collisions, under the relaxation time approximation, Eq. (7.7) could give the appropriate heat propagation speed, which is $1/\sqrt{3}$ times the speed of sound, as observed in liquid helium and some solids at low temperatures. This is a misinterpretation because the phenomenon, related to the second sound with a characteristic speed $v_{\mathrm{2nd}} = v_g/\sqrt{3}$, cannot occur by a single relaxation mechanism, as will be shown later. Nevertheless, after some modifications, there


exist special cases when the modified heat equation becomes physically plausible and practically applicable. The modified equation does not produce sharp wavefronts like those illustrated in Fig. 7.2.

7.1.2 Dual-Phase-Lag Model

Chester (Phys. Rev., 131, 2013, 1963) first explained the lagging behavior associated with the Cattaneo equation. He pointed out that the physical significance of the modified Fourier equation lies in the existence of a finite buildup time for the onset of a heat flow after a temperature gradient is imposed on the specimen: the heat flow does not start instantaneously but rather grows gradually during an initial period on the order of the relaxation time $\tau$. Conversely, if the thermal gradient is suddenly removed, there will be a lag in the disappearance of the heat current. Gurtin and Pipkin (Arch. Ration. Mech. Anal., 31, 113, 1968) introduced the memory effect to account for the delay of the heat flux with respect to the temperature gradient. They expressed the heat flux as an integral of the temperature gradient over time, in analogy with the stress-strain relationship of viscoelastic materials with instantaneous elasticity. The linearized constitutive equation reads

$$\mathbf{q}''(\mathbf{r},t) = -\int_{-\infty}^{t} K(t-t')\,\nabla T(\mathbf{r},t')\,dt' \tag{7.11}$$

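where $K(\xi)$ is a kernel function. When $K(\xi) = \kappa\,\delta(\xi)$, Eq. (7.11) reduces to Fourier’s law; when $K(\xi) = (\kappa/\tau_q)\,e^{-\xi/\tau_q}$, Eq. (7.11) reduces to the Cattaneo equation. By assuming

$$K(\xi) = \kappa_0\,\delta(\xi) + \frac{\kappa_1}{\tau_q}\,e^{-\xi/\tau_q} \tag{7.12}$$

Joseph and Preziosi showed that the heat flux can be separated into two parts:3

$$\mathbf{q}''(\mathbf{r},t) = -\kappa_0\nabla T - \frac{\kappa_1}{\tau_q}\int_{-\infty}^{t} \exp\!\left(-\frac{t-t'}{\tau_q}\right)\nabla T(\mathbf{r},t')\,dt' \tag{7.13a}$$

Hence,

$$\mathbf{q}'' + \tau_q\frac{\partial \mathbf{q}''}{\partial t} = -\kappa\nabla T - \tau_q\kappa_0\frac{\partial}{\partial t}\nabla T \tag{7.13b}$$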

where $\kappa = \kappa_0 + \kappa_1$ is the steady-state thermal conductivity, as can be seen from Eq. (7.13a). Combined with Eq. (7.3), the heat equation becomes a partial differential equation of the Jeffreys type,

$$\nabla^2 T + \tau_T\frac{\partial}{\partial t}\nabla^2 T = \frac{1}{\alpha}\frac{\partial T}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T}{\partial t^2} \tag{7.14}$$

where $\tau_T = \tau_q\kappa_0/\kappa$ is known as the retardation time.3 The Jeffreys equation was originally developed in the early twentieth century to relate deformation with stress in the earth’s mantle. Unless $\tau_T = 0$ or $\kappa_0 = 0$, Eq. (7.14) maintains the diffusive feature and produces an instantaneous response, albeit small, throughout the medium for an arbitrary thermal disturbance. In a series of papers published in the early 1990s, Tzou extended the lagging concept to a dual-phase-lag model, as described in his monograph published in 1997.15 He started with the assumption that

$$\mathbf{q}''(\mathbf{r},t+\tau_q) = -\kappa\nabla T(\mathbf{r},t+\tau_T) \tag{7.15}$$


The introduction of a delay time $\tau_T$ in Eq. (7.15) implies the existence of a lag in the temperature gradient, with respect to the heat flux driven by an internal or external heat source. The rationale of the phenomenological equation given in Eq. (7.15) was that, in some cases, the heat flux might be viewed as the result of a preceding temperature gradient; in other cases, the temperature gradient might be viewed as the result of a preceding heat flux. The heat flux and the temperature gradient can switch roles in the relationship between “cause” and “effect.” Moreover, both lags might occur simultaneously in certain materials under dramatic thermal disturbances, such as during short-pulse laser heating.4,15 These primitive arguments should not be scrutinized rigorously; rather, they are merely thinking instruments to help us gain an intuitive understanding of the heat flux and temperature gradient relationship. After applying the Taylor expansion to both sides of Eq. (7.15) and using the first-order approximation, one immediately obtains

$$\mathbf{q}'' + \tau_q\frac{\partial \mathbf{q}''}{\partial t} = -\kappa\nabla T - \tau_T\kappa\frac{\partial}{\partial t}\nabla T \tag{7.16}$$

which is mathematically identical to Eq. (7.13b), with the substitution $\tau_q\kappa_0 = \tau_T\kappa$. Applying the first-order approximation of Eq. (7.15), one may also end up with $\mathbf{q}'' + (\tau_q - \tau_T)(\partial\mathbf{q}''/\partial t) = -\kappa\nabla T$, or $\mathbf{q}'' = -\kappa\nabla T - \kappa(\tau_T - \tau_q)\,\partial(\nabla T)/\partial t$, or even $\mathbf{q}'' + (\tau_q - \tfrac{2}{3}\tau_T)(\partial\mathbf{q}''/\partial t) = -\kappa\nabla T - \tfrac{1}{3}\tau_T\kappa\,\partial(\nabla T)/\partial t$. These equations are merely special cases of Eq. (7.16), after regrouping $\tau_q$ and $\tau_T$. The only requirement for Eq. (7.16) to make logical sense is that both $\tau_q$ and $\tau_T$ are nonnegative. The reason that a lag in time has been called a phase lag is perhaps that the temperature field can be viewed as a Fourier transform: $T(\mathbf{r},t) = \int_{-\infty}^{\infty}\tilde{T}(\mathbf{r},\omega)\,e^{-i\omega t}\,d\omega$, where $\tilde{T}(\mathbf{r},\omega)$ is the Fourier component at frequency $\omega$. The actual phase lag $\omega\tau_T$ (or $\omega\tau_q$ for the heat flux) depends on the frequency. Equation (7.16) is mathematically more general and has some advantages over the Cattaneo equation. From now on, Eq. (7.14) will be called the lagging heat equation. It is straightforward to include the source terms in the lagging heat equation, as well as to treat thermophysical properties as temperature dependent. The solution, however, becomes more and more difficult as the complexity increases. Numerous studies have appeared in the literature on analytical solutions and numerical methods.4,15–18 It should be noted that in Eq. (7.12), $\kappa_0$ and $\kappa_1$ denote the effective and elastic conductivities, respectively, and are supposed to be nonnegative.3 Therefore, $\tau_T$ must not be greater than $\tau_q$. In fact, the ratio $\eta = \kappa_0/(\kappa_0 + \kappa_1) = \tau_T/\tau_q$ is a direct indication of whether thermal behavior can be described by heat diffusion (when $\eta = 1$) or the hyperbolic heat equation (when $\eta = 0$). In general, $0 < \eta < 1$, and the thermal process lies somewhere between the two extremes prescribed by Fourier’s law and the Cattaneo equation. In other words, there will be wavelike features in the solution, on which an instantaneous diffusive response throughout the medium is superimposed.
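The frequency dependence of the lag can be made concrete with a short sketch. It assumes that every field varies as exp(-iωt), so that ∂/∂t becomes -iω in the first-order relation q'' + τ_q ∂q''/∂t = -κ[∇T + τ_T ∂(∇T)/∂t] of Eq. (7.16); the function names and the dimensionless lag times are hypothetical choices made only for illustration.

```python
import cmath
import math


def flux_transfer(omega, tau_q, tau_T):
    """Complex ratio of the flux amplitude to (-kappa * gradient amplitude).

    Assumes a harmonic time dependence exp(-1j * omega * t) in Eq. (7.16).
    """
    return (1 - 1j * omega * tau_T) / (1 - 1j * omega * tau_q)


def net_phase_lag(omega, tau_q, tau_T):
    """Phase (rad) by which the flux trails the gradient; negative means it leads."""
    return cmath.phase(flux_transfer(omega, tau_q, tau_T))


# Hypothetical dimensionless lag times, for illustration only.
tau_q, tau_T = 1.0, 0.2
for omega in (0.01, 1.0, 100.0):
    lag = net_phase_lag(omega, tau_q, tau_T)
    gain = abs(flux_transfer(omega, tau_q, tau_T))
    print(f"omega*tau_q = {omega * tau_q:7.2f}  lag = {lag:+.4f} rad  gain = {gain:.4f}")
```

Under these assumptions the net lag equals arctan(ωτ_q) - arctan(ωτ_T), which approaches ω(τ_q - τ_T) at low frequency, and the amplitude ratio tends to τ_T/τ_q at high frequency.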
The diffusive response here, as well as in Fourier’s law, does not correspond to an infinite speed of propagation. Rather, it is well justified by quantum statistics, as explained previously. The dual-phase-lag model relaxes the requirement $\tau_T \le \tau_q$; in the meantime, however, it produces a negative thermal conductivity component, i.e., $\kappa_1 < 0$, according to Eq. (7.12). This drawback has long been overcome by Tzou, who proposed a new memory function in accordance with Eq. (7.16) as follows:15

$$\mathbf{q}''(\mathbf{r},t) = -\frac{\kappa}{\tau_q}\int_{-\infty}^{t} \exp\!\left(-\frac{t-t'}{\tau_q}\right)\left[\nabla T(\mathbf{r},t') + \tau_T\frac{\partial}{\partial t'}\nabla T(\mathbf{r},t')\right]dt' \tag{7.17}$$
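As a numerical cross-check, the sketch below evaluates the memory integral of Eq. (7.17) by the trapezoidal rule and confirms that the resulting flux satisfies the differential form q'' + τ_q ∂q''/∂t = -κ[∇T + τ_T ∂(∇T)/∂t] of Eq. (7.16), here with τ_T > τ_q. The exponential gradient history, the dimensionless parameter values, and all function names are assumptions chosen only so that the integral converges and has a closed form.

```python
import math


def flux_from_memory(t, grad, dgrad, kappa, tau_q, tau_T, n=20000, span=40.0):
    """Trapezoidal evaluation of Eq. (7.17); the lower limit is truncated at t - span*tau_q."""
    a = t - span * tau_q
    h = (t - a) / n
    total = 0.0
    for i in range(n + 1):
        tp = a + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-(t - tp) / tau_q) * (grad(tp) + tau_T * dgrad(tp))
    return -(kappa / tau_q) * total * h


# Hypothetical dimensionless test case with tau_T > tau_q.
kappa, tau_q, tau_T, s = 1.0, 1.0, 2.5, 0.5


def grad(t):
    """Assumed temperature-gradient history, vanishing as t -> -infinity."""
    return math.exp(s * t)


def dgrad(t):
    """Time derivative of the assumed gradient history."""
    return s * math.exp(s * t)


def q_at(t):
    return flux_from_memory(t, grad, dgrad, kappa, tau_q, tau_T)


t0, dt = 0.0, 1e-3
q0 = q_at(t0)
dq_dt = (q_at(t0 + dt) - q_at(t0 - dt)) / (2 * dt)   # central difference

# Residual of Eq. (7.16); it should vanish to within the discretization error.
residual = q0 + tau_q * dq_dt + kappa * (grad(t0) + tau_T * dgrad(t0))

# Closed form for the exponential history, used only as a reference.
q_exact = -kappa * (1 + s * tau_T) / (1 + s * tau_q) * math.exp(s * t0)
print(f"q(0) numeric = {q0:.6f}, closed form = {q_exact:.6f}, residual = {residual:.2e}")
```

Nothing in the check requires τ_T ≤ τ_q, which is consistent with the remark that the memory function does not rely on splitting the conductivity into effective and elastic parts.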

Equation (7.17) suggests that the heat flux depends not only on the history of the temperature gradient but also on the history of the time derivative of $\nabla T$. When $\tau_T = 0$, Eq. (7.17) becomes the Cattaneo equation. When $\tau_T = \tau_q$, Eq. (7.17) reduces to Fourier’s law. However, $\tau_T > \tau_q$ is theoretically permitted because Eq. (7.17) does not presume that the


thermal conductivity is composed of an effective conductivity and an elastic conductivity. The inclusion of $\tau_T > \tau_q$ makes Eq. (7.16) more general than the original Jeffreys-type equation, which is based on Eq. (7.13a). The extension to the region $\tau_T > \tau_q$ enables the lagging heat equation to describe the behavior of parallel heat conduction, which can occur in a number of engineering situations. Sometimes, a microscale phenomenon can be understood easily if a macroscale analog can be drawn. For this reason, let us consider the solid-fluid heat exchanger shown in Fig. 7.3.

FIGURE 7.3 Illustration of heat transfer in a solid-fluid heat exchanger, where long solid rods are immersed in a fluid inside a sealed pipe, which is insulated from the outside.

Assume that a fluid is stationary inside a sealed pipe filled with long solid rods. The pipe is insulated from the outside. If the rods are sufficiently thin, we may use the average temperature in a cross section and assume that heat transfer takes place along the x direction only. Let us denote the temperatures of the solid rods and the fluid by $T_s(x,t)$ and $T_f(x,t)$, respectively, and take their properties $\kappa_s$, $C_s = (\rho c_p)_s$, $\kappa_f$, and $C_f = (\rho c_p)_f$ to be constant. Note that $C_s$ and $C_f$ are the volumetric heat capacities. Given the rod diameter $d$, the number of rods $N$, and the inner diameter $D$ of the pipe, the total surface area of the rods per unit length is $P = N\pi d$, and the total cross-sectional areas of the rods and the fluid are $A_c = N\pi d^2/4$ and $A_f = (\pi/4)(D^2 - Nd^2)$, respectively. Assume the average convection coefficient is $h$. The energy balance equations can be obtained using the control volume analysis as follows:

$$C_s\frac{\partial T_s}{\partial t} = \kappa_s\frac{\partial^2 T_s}{\partial x^2} - G(T_s - T_f) \tag{7.18a}$$

and

$$C_f'\frac{\partial T_f}{\partial t} = G(T_s - T_f) \tag{7.18b}$$

where $G = hP/A_c$ and $C_f' = C_fA_f/A_c$. In writing Eq. (7.18b), we have assumed that $\kappa_f \ll \kappa_s$ and dropped the term $\kappa_f(\partial^2 T_f/\partial x^2)$. Equations (7.18a) and (7.18b) are coupled equations that can be solved for the prescribed initial and boundary conditions. These are completely macroscopic equations governed by Fourier’s law of heat conduction. Nevertheless, we can combine Eq. (7.18a) and Eq. (7.18b) to eliminate $T_f$ and, consequently, obtain the following differential equation for $T_s$:

$$\frac{\partial^2 T_s}{\partial x^2} + \tau_T\frac{\partial}{\partial t}\left(\frac{\partial^2 T_s}{\partial x^2}\right) = \frac{1}{\alpha}\frac{\partial T_s}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T_s}{\partial t^2} \tag{7.18c}$$


where $\alpha = \kappa_s/(C_s + C_f')$, $\tau_T = C_f'/G$, and $\tau_q = C_s\tau_T/(C_s + C_f') < \tau_T$. The same equation can also be obtained for the fluid temperature $T_f$. Here, $\tau_q$ does not have the meaning of relaxation time, and the solutions of Eq. (7.18) exhibit diffusion characteristics. Equation (7.18c) is completely physical but should not be viewed as a wave equation; rather, it describes a parallel or coupled heat diffusion process. The concept of dual phase lag can still be applied. It should be noted that, due to the initial temperature difference between the rod and the fluid, a local equilibrium is not established at any $x$ inside the pipe until after a sufficiently long time. Although no fundamental physics can be gained from this example, it can help us appreciate that the lagging heat equation may be useful for describing the behavior in inhomogeneous media. Minkowycz et al. studied the heat transfer in porous media by considering the departure from local thermal equilibrium and obtained higher-order differential equations similar to Eq. (7.18c).19 On the other hand, Kaminski made an experimental attempt to determine $\tau_q$ in the hyperbolic heat equation, by measuring the time interval between when the heat source was turned on and when a temperature signal was detected.22 The heat source and the thermometer used were long needles, placed in parallel and separated by a gap of 5 to 20 mm. What the experiment actually measured was the average thermal diffusion speed $v_{\mathrm{dif}}$, as would be found if the cylindrical geometry and the initial conditions were properly taken into consideration in the analysis. The main problem with this frequently cited paper and similar studies in the 1990s was that most researchers did not realize that there is no physical justification for regarding the hyperbolic heat equation as superior to the parabolic heat equation; instead, they thought that the parabolic equation was only a special case of the more general hyperbolic equation.
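Returning to the heat-exchanger analogy of Eq. (7.18), the following sketch evaluates the lumped parameters for one hypothetical geometry. The rod count, dimensions, convection coefficient, and approximate handbook properties for copper rods in water are all assumed for illustration, and the exchange area per unit length is taken to be the total rod circumference Nπd.

```python
import math


def exchanger_lag_parameters(N, d, D, h, kappa_s, C_s, C_f):
    """Return (alpha, tau_T, tau_q, G) for the rod-bundle model of Eq. (7.18)."""
    P = N * math.pi * d                        # rod surface area per unit length (assumed)
    A_c = N * math.pi * d**2 / 4               # total rod cross-sectional area
    A_f = (math.pi / 4) * (D**2 - N * d**2)    # fluid cross-sectional area
    G = h * P / A_c                            # volumetric coupling, W/(m^3 K)
    C_f_eff = C_f * A_f / A_c                  # C'_f in the text
    alpha = kappa_s / (C_s + C_f_eff)
    tau_T = C_f_eff / G
    tau_q = C_s * tau_T / (C_s + C_f_eff)
    return alpha, tau_T, tau_q, G


# Hypothetical case: 20 copper rods (2 mm diameter) in a 20 mm water-filled pipe.
alpha, tau_T, tau_q, G = exchanger_lag_parameters(
    N=20, d=2e-3, D=20e-3, h=100.0, kappa_s=400.0, C_s=3.45e6, C_f=4.18e6)

print(f"G     = {G:.3e} W/(m^3 K)")
print(f"alpha = {alpha:.3e} m^2/s")
print(f"tau_T = {tau_T:.1f} s, tau_q = {tau_q:.1f} s, tau_T/tau_q = {tau_T / tau_q:.2f}")
```

For spatially uniform temperatures, Eqs. (7.18a) and (7.18b) give a difference T_s - T_f that decays at the rate G(1/C_s + 1/C'_f) = 1/τ_q, so in this analogy τ_q acts as the solid-fluid equilibration time rather than a collision relaxation time.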
It appears that the Cattaneo equation and the associated hyperbolic heat equation are unlikely to be able to characterize any heat transfer problems successfully without additional terms. Many researchers have already expressed doubt about the applicability of the hyperbolic heat equation, though not so many have realized that an instantaneous response is a legitimate property, rather than a drawback, of the diffusion equation. Electron gas and phonon gas in solids are quantum mechanical particles, which do not have memory of any kind. Ideal molecular gases obey classical statistics and do not have memory either, unless the deposited energy is so intense as to cause ionization or reaction. Does the temperature wave exist? What is a temperature wave anyway? In the early 1940s, Russian theoretical physicist Lev Landau (1908–1968) used a two-fluid model to study the behavior of quasiparticles in superfluid helium II and predicted the existence of a second sound, propagating at a speed between $v_g/\sqrt{3}$ and $v_g$, depending on the temperature. Note that the group velocity is the same as the phase velocity for a linear dispersion. Above the $\lambda$-point, where superfluidity is lost, the second sound should also disappear. Landau was awarded the Nobel Prize in Physics in 1962 for his pioneering theories of condensed matter at low temperatures. He authored with his students a famous book series in mechanics and physics. Landau’s prediction was validated experimentally (J. Phys. USSR, 8, 381, 1944) by Peshkov, who further postulated the existence of a second sound in crystals, when scattering by defects becomes minimized. It was not until the mid-1960s that the second sound associated with heat pulse propagation was observed in solid helium (below 1 K) and other crystals at low temperatures (below 20 K).
The second sound can occur only at very low temperatures when the mean free path of phonons in the U processes, in which the total momentum is not conserved, is longer than the specimen size; while at the same time, the scattering rate of the N processes, in which the total momentum is conserved, is high enough to dominate other scattering processes. It should be noted that while the N processes have a much shorter mean free path than the size of the specimen, scattering by N processes does not dissipate heat (see Sec. 6.5.3). Callaway simplified the BTE for phonon systems by a two-relaxation-time approximation, which should be applicable when $t > \tau_N$:

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} = \frac{f_0 - f}{\tau} + \frac{f_1 - f}{\tau_N} \tag{7.19}$$


where $\tau$ stands for the relaxation time for the U processes, $\tau_N$ is the relaxation time for the N processes, and $f_0$ and $f_1$ are the associated equilibrium distribution functions.21 Guyer and Krumhansl solved the linearized BTE and derived the following equation for the phonon effective temperature:

$$\nabla^2 T + \frac{9\tau_N}{5}\frac{\partial}{\partial t}\nabla^2 T = \frac{3}{\tau v_a^2}\frac{\partial T}{\partial t} + \frac{3}{v_a^2}\frac{\partial^2 T}{\partial t^2} \tag{7.20}$$

where $v_a$ is the average phonon speed.22 Assuming a linear dispersion, it can be evaluated using Eq. (5.10). Substituting $\alpha = \tau v_a^2/3$, $\tau_q = \tau$, and $\tau_T = 9\tau_N/5$, we see that Eq. (7.20) is identical to Eq. (7.14). The condition $t > \tau_N$ can be satisfied even at $t < \tau$ since $\tau_N \ll \tau$. The significance of Eq. (7.20) lies in that the temperature wave or the second sound is not universal, but rather, requires strict conditions to be met.22 When the condition $\tau_N \ll \tau$ is satisfied, we have $\tau_T \ll \tau_q$ and the energy transfer is dominated by wave propagation. At higher temperatures, the scattering rate for the U processes is usually very high, and the N processes contribute little to the heat conduction or thermal resistance, as discussed in Chap. 6. Therefore, the reason why temperature waves have not been observed in insulators at room temperature is not the small $\tau$, in the range from $10^{-10}$ to $10^{-13}$ s, but the lack of the mechanisms required for a second sound to occur. No experiments have ever shown a second sound in metals, as suggested by the hyperbolic heat equation. Recently, Shiomi and Maruyama performed molecular dynamics simulations of the heat conduction through (5,5) single-walled carbon nanotubes, 25 nm in length, for several femtoseconds.23 They found that the wavelike behavior could be fitted by the lagging heat equation, but could not be described by the hyperbolic heat equation due to local diffusion. The ballistic nature of heat propagation in nanotubes has already been explained in Chap. 5. They suspected that optical phonons might play a major role in the non-Fourier conduction process.23 Tsai and MacDonald studied the strong anharmonic effects at high temperature and pressure using molecular dynamics.24 Their work predicted a second sound response. The coupling of elastic and thermal effects was thought to be important.
Studies on thermomechanical effects such as thermal expansion, thermoelasticity, and shock waves can be found in Tzou15 and Wang and Xu,25 and will not be discussed further. Tang and Araki clearly delineated four regimes in the lagging heat equation, according to the ratio $\eta = \tau_T/\tau_q$.17 (1) When $\eta = 0$, it is a damped wave, i.e., hyperbolic heat conduction. (2) When $0 < \eta < 1$, it is wavelike diffusion, for which wave features can be clearly seen if $\eta \ll 1$. (3) When $\eta = 1$, it is pure diffusion, i.e., Fourier’s conduction. (4) When $\eta > 1$, it is called over-diffusion, which makes the dimensionless temperature decay faster than pure diffusion would. In the next section, we will discuss a microscopic theory on short-pulse laser heating of metals, which falls in the regime of over-diffusion, or parallel conduction.
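The four regimes can be written as a small classifier on the ratio τ_T/τ_q. This is only a sketch: the function name, the returned labels, and the tolerance used to decide when the ratio equals 1 are assumptions, and the sample lag times are illustrative.

```python
import math


def lagging_regime(tau_T, tau_q, rel_tol=1e-9):
    """Classify the lagging heat equation by the ratio tau_T / tau_q."""
    if tau_q <= 0 or tau_T < 0:
        raise ValueError("expected tau_q > 0 and tau_T >= 0")
    ratio = tau_T / tau_q
    if ratio == 0:
        return "damped wave (hyperbolic heat conduction)"
    if math.isclose(ratio, 1.0, rel_tol=rel_tol):
        return "pure diffusion (Fourier conduction)"
    if ratio < 1:
        return "wavelike diffusion"
    return "over-diffusion"


# Illustrative (tau_T, tau_q) pairs in seconds; the last mimics short-pulse heating of a metal.
for tau_T, tau_q in [(0.0, 1e-12), (5e-14, 1e-12), (1e-12, 1e-12), (8e-11, 7e-13)]:
    print(f"tau_T/tau_q = {tau_T / tau_q:8.3f} -> {lagging_regime(tau_T, tau_q)}")
```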

7.1.3 Two-Temperature Model

With a short laser pulse, 5 fs to 500 ps, free electrons absorb radiation energy and the absorbed energy excites the electrons to higher energy levels. The “hot electrons” move around randomly and dissipate heat mainly through electron-phonon interactions. Following the work of Kaganov et al. (Sov. Phys. JETP, 4, 173, 1957), Anisimov proposed a two-temperature model, which is a pair of coupled nonlinear equations governing the effective temperatures of electrons and phonons.26 This model was experimentally confirmed later by Fujimoto et al. (Phys. Rev. Lett., 53, 1837, 1984) and Brorson et al. (Phys. Rev. Lett., 59, 1962, 1987). The two-temperature model was introduced to the heat transfer community by Qiu and Tien, who also analyzed the size effect due to boundary scattering and performed experiments with thin metallic films.27 In the two-temperature model, it was


assumed that the electron and phonon systems are each at their own local equilibrium, but not in mutual equilibrium. The electron temperature could be much higher than the lattice (or phonon) temperature due to absorption of pulse heating. Therefore,

$$C_e\frac{\partial T_e}{\partial t} = \nabla\cdot(\kappa\nabla T_e) - G(T_e - T_s) + \dot{q}_a \tag{7.21a}$$

$$C_s\frac{\partial T_s}{\partial t} = G(T_e - T_s) \tag{7.21b}$$

Here, the subscripts e and s are for the electron and phonon systems, respectively, $C$ is the volumetric heat capacity, $G$ is the electron-phonon coupling constant, and $\dot{q}_a$ is the source term that represents the absorbed energy rate per unit volume during the laser pulse and drops to zero after the pulse. Heat conduction by phonons is neglected, and thus, the subscript e is dropped in the thermal conductivity $\kappa$. Note that $\mathbf{q}'' = -\kappa\nabla T_e$, according to Fourier’s law. We have already given a macroscopic example of parallel heat transfer, as shown in Fig. 7.3, which should ease the understanding of the phenomenological relations given in Eq. (7.21). Equation (7.21) originates from microscopic interactions between photons, electrons, and phonons. In order to examine the parameters in Eq. (7.21) and their dependence on $T_e$ and $T_s$, let us assume that the lattice temperature is near or above the Debye temperature, for simplicity. In such a case, electron-electron scattering and electron-defect scattering are insignificant compared with electron-phonon scattering. It is expected that the electron relaxation time is inversely proportional to the lattice temperature, i.e., $\tau \approx \tau_{e\text{-}ph} \propto T_s^{-1}$. The meaning of the relaxation time is that the electron system can be assumed to be at internal local equilibrium when $t > \tau$, which is the condition for Eq. (7.21) to be applicable. Boundary scattering may play a role for very thin films or in polycrystalline materials. An effective mean free path can be introduced to modify the scattering rate.27–29 The volumetric heat capacity for the lattice or phonons, $C_s = \rho c_p$, is a weak function of the lattice temperature; the volumetric heat capacity of electrons, from Eq. (5.25), becomes

$$C_e = \frac{\pi^2 n_e k_B^2}{2\mu_F}T_e = \gamma_s T_e \tag{7.22}$$
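To get a feel for the magnitudes in Eq. (7.22), the sketch below evaluates C_e = π² n_e k_B² T_e / (2μ_F) for gold and compares it with the lattice heat capacity. The electron density, Fermi energy, density, and specific heat are approximate handbook values assumed for this illustration, and the free-electron estimate need not match measured values closely.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # joules per electron volt


def electron_heat_capacity(n_e, mu_F, T_e):
    """Volumetric electron heat capacity C_e = gamma_s * T_e from Eq. (7.22), in J/(m^3 K)."""
    gamma_s = math.pi**2 * n_e * K_B**2 / (2.0 * mu_F)
    return gamma_s * T_e


# Assumed free-electron inputs for gold (approximate handbook values).
n_e = 5.90e28            # electron number density, m^-3
mu_F = 5.53 * EV         # Fermi energy, J
C_s = 19300.0 * 129.0    # lattice rho * c_p, J/(m^3 K)

for T_e in (300.0, 3000.0):
    C_e = electron_heat_capacity(n_e, mu_F, T_e)
    print(f"T_e = {T_e:6.0f} K  C_e = {C_e:.3e} J/(m^3 K)  C_e/C_s = {C_e / C_s:.4f}")
```

With these inputs the electron heat capacity remains a small fraction of the lattice value even at an electron temperature of a few thousand kelvins.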

Recall that $C_e$ is relatively small compared with $C_s$, even at several thousand kelvins. From the simple kinetic theory, the thermal conductivity is

$$\kappa = \frac{\pi^2 n_e k_B^2}{3m_e}\tau T_e \approx \frac{T_e}{T_s}\kappa_{\mathrm{eq}} \tag{7.23}$$

where $\kappa_{\mathrm{eq}}$ is the thermal conductivity when $T_e = T_s$, which can be set as the room temperature value. The term $T_e$ in Eq. (7.23) comes from the heat capacity. The size effect can be included using an effective relaxation time. Theoretically, the coupling constant can be estimated by

$$G = \frac{\pi^2 m_e n_e v_a^2}{6\tau T_s} \quad\text{or}\quad G = \frac{\pi^4 (n_e v_a k_B)^2}{18\kappa_{\mathrm{eq}}} \tag{7.24}$$

which is independent of temperature when boundary scattering is not important, and is proportional to the square of the speed of sound in the metal. Because the speed of sound is taken in the low-frequency limit, where the dispersion is linear, we do not have to worry about the difference between the phase velocity and the group velocity. From Eq. (5.10), we have

$$v_a = \frac{k_B\Theta_D}{h}\left(\frac{4\pi}{3n_a}\right)^{1/3} \tag{7.25}$$


When boundary scattering is included, $G$ is expected to increase from the bulk value and depend on the lattice temperature. Using the Debye temperature and for $n_a = n_e$, we have

$$G = \frac{\pi^2}{12\sqrt[3]{4}}\,\frac{n_e k_B^2\Theta_D^2}{\tau T_s\mu_F} \approx 0.518\,\frac{n_e k_B^2\Theta_D^2}{\tau T_s\mu_F} \tag{7.26}$$
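An order-of-magnitude estimate of the coupling constant can be sketched by combining Eq. (7.25) for the speed of sound with the second form of Eq. (7.24), G = π⁴(n_e v_a k_B)²/(18 κ_eq). The Debye temperature, electron density, and room-temperature conductivity used for gold are approximate handbook values assumed for this illustration, with n_a = n_e.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s


def debye_sound_speed(theta_D, n_a):
    """Average speed of sound from Eq. (7.25), in m/s."""
    return (K_B * theta_D / H) * (4.0 * math.pi / (3.0 * n_a)) ** (1.0 / 3.0)


def coupling_constant(n_e, v_a, kappa_eq):
    """Electron-phonon coupling constant from the second form of Eq. (7.24), in W/(m^3 K)."""
    return math.pi**4 * (n_e * v_a * K_B) ** 2 / (18.0 * kappa_eq)


# Assumed inputs for gold (approximate handbook values), taking n_a = n_e.
n_e = 5.90e28         # m^-3
theta_D = 165.0       # Debye temperature, K
kappa_eq = 315.0      # room-temperature thermal conductivity, W/(m K)

v_a = debye_sound_speed(theta_D, n_e)
G = coupling_constant(n_e, v_a, kappa_eq)
print(f"v_a = {v_a:.0f} m/s, G = {G:.2e} W/(m^3 K)")
```

With these inputs the estimate comes out at roughly 2 × 10^16 W/(m³·K), the same order of magnitude as the values typically quoted for noble metals.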

Typical values of $G$ are on the order of $10^{16}$ W/(m³·K), e.g., $G \approx 2.9 \times 10^{16}$ W/(m³·K) for gold. The behavior of the electron and phonon temperatures near the surface is shown in Fig. 7.4 for a short pulse.

FIGURE 7.4 Illustration (not to scale) of ultrafast thermoreflectance experiments and the associated electron and phonon temperatures near the surface, during a short pulse.

The electron temperature rises quickly during the pulse and begins to decrease afterward; in the meantime, the lattice temperature gradually increases until the electron and lattice systems reach thermal equilibrium. Both temperatures will go down as heat is carried away from the surface. Note that the electron temperature can rise very high due to the small heat capacity of the electrons, but the lattice or solid may be just slightly above room temperature. If the temperatures of the electrons and the lattice were assumed the same, Eq. (7.21) would reduce to the simple Fourier heat conduction equation, which in turn predicts a much lower temperature rise, because the heat capacity of the lattice is much higher than that of the electrons. Given such a short timescale and the nonequilibrium nature between electrons and phonons locally, no contact thermometer could possibly measure the effective electron temperature. Experiments are usually performed by the femtosecond or picosecond thermoreflectance technique, also known as the pump-and-probe method, shown in the inset of Fig. 7.4. The reflectance of the surface depends on the electron temperature $T_e$. The experimental setup is rather involved and cannot measure the temperature distribution inside the material. The procedure is to send a pump pulse train that is synchronized with a probe pulse train at a fixed delay time. The electron temperature change near the surface is deduced by comparison of the reflectance measurements at different delay times. Electron-phonon coupling, boundary scattering, and thermal boundary resistance (TBR) can all affect the thermoreflectance signal. Comparing with the model described in Eq. (7.21), along with the


dependence of the reflectance on the electron temperature, the microscopic characteristics can be analyzed. Ultrafast thermoreflectance techniques have become an important thermal metrology tool for the study of electron-phonon interactions, TBR, and thermophysical properties.30,31 Thermionic emission can also occur from the surface, especially when the electrons are excited to higher energy states.32 Similarly to what has been done for Eq. (7.18), Eq. (7.21a) and Eq. (7.21b) can be combined to formulate partial differential equations for either the electron or phonon temperature. Neglecting the temperature dependence of the parameters, one obtains the following differential equations for the electron temperature and the phonon temperature, respectively,

$$\nabla^2 T_e + \tau_T\frac{\partial}{\partial t}\nabla^2 T_e + \frac{\dot{q}_a}{\kappa} + \frac{\tau_T}{\kappa}\frac{\partial \dot{q}_a}{\partial t} = \frac{1}{\alpha}\frac{\partial T_e}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T_e}{\partial t^2} \tag{7.27a}$$

$$\nabla^2 T_s + \tau_T\frac{\partial}{\partial t}\nabla^2 T_s + \frac{\dot{q}_a}{\kappa} = \frac{1}{\alpha}\frac{\partial T_s}{\partial t} + \frac{\tau_q}{\alpha}\frac{\partial^2 T_s}{\partial t^2} \tag{7.27b}$$

where $\alpha = \kappa/(C_e + C_s)$, $\tau_T = C_s/G$, and $\tau_q = \tau_TC_e/(C_e + C_s) \approx C_e/G \ll \tau_T$. These equations are identical in form to the lagging heat equation and can be solved with appropriate boundary conditions. The results again belong to the regime of over-diffusion, or parallel conduction, without any wavelike features. Cooling caused by thermionic emission is usually neglected, and the surface under illumination can be assumed adiabatic. A 1-D approximation further simplifies the problem. The solution follows the general trends depicted in Fig. 7.4. The situation will be completely changed if a phase change occurs or if the system is driven to exceed the linear harmonic behavior.15,25 The term $\tau_q$ is clearly not the same as the relaxation time $\tau$ due to collision. The resulting solution is more diffusive than wavelike. In the literature, $\tau_q$ is commonly referred to as the thermalization time. The physical meaning of $\tau_q$ is a thermal time constant for the electron system to reach an equilibrium with the phonon system. For noble metals at room temperature, the relaxation time $\tau$ is on the order of 30 to 40 fs, the thermalization time $\tau_q$ is 0.5 to 0.8 ps, and the retardation time $\tau_T$ is 60 to 90 ps. In practice, we need to consider the temperature dependence of the parameters in Eq. (7.21), as mentioned earlier. Some numerical solutions, considering temperature dependence, and comparisons with experiments can be found in Smith et al.33 and Zhou and Chiu.34 Given that the two-temperature model cannot be applied to $t < \tau$, due to the limitation of Fourier’s law, one may prefer to use a pulse width $t_p$ between 100 and 200 fs and measure the response during several picoseconds until the thermalization process is complete, i.e., the electron and phonon temperatures become the same. This first-stage measurement allows the determination of the coupling constant $G$. In the case of a thin film, the TBR sets a barrier for heat conduction between the film and the substrate.
The time constant of the film can range from several tens to hundreds of picoseconds. Therefore, the TBR between the film and the substrate can be determined by continuing the observation of thermoreflectance signals for 1 to 2 ns after each pulse. Fitting the curves in the second-stage measurement allows an estimate of the TBR. Of course, one could use a longer pulse width $t_p$ to determine the TBR. Most advanced femtosecond research laboratories are equipped with Ti:sapphire lasers whose pulse widths range from 50 to 500 fs. Femtosecond lasers with a pulse width of 25 fs have also been used in some studies; see for example Li et al. (J. Opt. Soc. Am. B, 15, 2404, 1998; Phys. Rev. Lett., 82, 2394, 1999). For $t_p$ below 50 fs, Eq. (7.21a) is not applicable during the heating, at least for noble metals. The relaxation time for Cr is about 3 fs, and Eq. (7.21) can be safely applied even with $t_p = 10$ fs. However, the processes below 20 fs may largely involve electron-electron inelastic scattering, thermionic emission, ionization, phase transformation, chemical reaction, and so forth. Other difficult issues associated with the reduced pulse width include widened frequency spectrum, increased pulse intensity, decreased pulse


energy, and so forth. A simple hyperbolic heat flux formulation cannot properly address these issues at $t_p < \tau$. One must investigate the physical and chemical processes occurring at this timescale in order to develop a physically plausible model, with or without the concept of effective temperatures. Femtosecond laser interactions with dielectric materials have also been extensively studied (see Jiang and Tsai35 and references therein). Let us reiterate the major points presented in this section: (a) Fourier’s law, which is limited to local equilibrium conditions, does not predict an infinite speed of heat diffusion, nor does it violate the principle of causality. An instantaneous response at a finite distance is permitted by quantum statistics although the probability of such a response sharply approaches zero as the distance increases. An instantaneous temperature change or heat flux at a precise location is not physically possible. Only under the continuum assumption can we use the concept of a sudden change of temperature at the boundary. (b) Heat diffusion is usually a very slow process, compared with the speed of sound. The temperature wave, or the second sound, has been observed only in helium and some very pure dielectric crystals, at low temperatures, where the U processes are ballistic and the N processes have a very high scattering rate. However, the simple hyperbolic heat equation has been proved neither theoretically nor experimentally. There is no need to collect previous or future experimental evidence to test the hyperbolic heat equation, which was ill-formulated in the first place. (c) All kinds of non-Fourier equations are based on some sort of effective temperature, which is not measurable using a contact thermometer. The principle of contact thermometry is the zeroth law of thermodynamics, which originates from the theory of thermal equilibrium.
The concept of coldness or hotness should be abandoned in reference to nonequilibrium energy transport processes. Noncontact thermometry, on the other hand, relies on certain physical responses to deduce the equilibrium temperature or the effective temperature of the system being measured. (d) The memory hypothesis and the lagging argument are phenomenological models that may be useful in the study of certain nonequilibrium or parallel conduction processes, but are not universally applicable. These and similar equations must be derived and applied on a case-by-case basis. It is important to understand the microscopic processes occurring at the appropriate length scales and timescales in order to develop physically reliable models.
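To make the time constants of the two-temperature model in Sec. 7.1.3 concrete, the following sketch integrates Eq. (7.21) for a spatially uniform state after the pulse, with conduction and the source term omitted. It takes G = 2.9 × 10^16 W/(m³·K) as quoted for gold and assumes constant heat capacities of about 2.1 × 10^4 J/(m³·K) for the electrons and 2.5 × 10^6 J/(m³·K) for the lattice; the initial temperatures, time step, and function name are hypothetical, and the temperature dependence of the parameters is ignored.

```python
import math


def relax_two_temperature(T_e0, T_s0, C_e, C_s, G, dt, t_end):
    """Explicit Euler integration of Eq. (7.21) without conduction or source terms."""
    T_e, T_s = T_e0, T_s0
    for _ in range(int(round(t_end / dt))):
        exchange = G * (T_e - T_s) * dt    # energy passed from electrons to lattice, J/m^3
        T_e -= exchange / C_e
        T_s += exchange / C_s
    return T_e, T_s


# Assumed constant properties for gold near room temperature.
C_e, C_s, G = 2.1e4, 2.5e6, 2.9e16
tau_T = C_s / G                       # retardation time
tau_q = tau_T * C_e / (C_e + C_s)     # thermalization time

T_e0, T_s0 = 600.0, 300.0             # hypothetical temperatures right after the pulse, K
t_end = 2.0e-12
T_e, T_s = relax_two_temperature(T_e0, T_s0, C_e, C_s, G, dt=1.0e-15, t_end=t_end)

# For constant properties the temperature difference decays as exp(-t / tau_q).
diff_exact = (T_e0 - T_s0) * math.exp(-t_end / tau_q)
print(f"tau_q = {tau_q * 1e12:.2f} ps, tau_T = {tau_T * 1e12:.1f} ps")
print(f"T_e - T_s after 2 ps: Euler = {T_e - T_s:.2f} K, exponential = {diff_exact:.2f} K")
```

Under these assumptions the two time constants fall within the ranges quoted for noble metals, and the electron-lattice temperature difference disappears within a few picoseconds, long before the retardation time has elapsed.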

7.2 HEAT CONDUCTION ACROSS LAYERED STRUCTURES

In Sec. 5.5.2, we gave a detailed discussion of heat conduction along a thin film using the BTE, under the local equilibrium assumption. An effective thermal conductivity can be used after taking proper account of boundary scattering. The heat conduction problem can thus be well described by Fourier’s law using the effective thermal conductivity. As mentioned earlier, for heat transfer across a film or a superlattice, the condition of local equilibrium breaks down in the acoustically thin limit. The local distribution function cannot be approximated by an equilibrium distribution function at any temperature. Conventional Fourier’s law breaks down because it relies on the definition of an equilibrium temperature and the existence of local equilibrium. It is natural to ask the following two questions: (1) Is it possible for us to define an effective temperature? (2) Can Fourier’s law still be useful in the nonequilibrium regime, according to the effective temperature? This section presents the equation of phonon radiative transfer (EPRT) and the solution of the EPRT for thin films under the relaxation time approximation. A resistance network representation is presented to illustrate how Fourier’s law of heat conduction may be applied inside the medium, at least approximately, with temperature-jump boundary conditions. Because of the importance of understanding the boundary conditions, this section also discusses models of thermal boundary resistance (TBR) in layered structures.


7.2.1 Equation of Phonon Radiative Transfer (EPRT)

The phonon BTE under the relaxation time approximation, in a region with heat generation, may be written as

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} = \frac{f_0 - f}{\tau(\omega,T)} + S_0 \tag{7.28}$$

where the second term $S_0$ on the right-hand side is a source term to model the generation of phonons due to heat dissipation, such as electron-phonon scattering. Phonon-phonon scattering is already included in the first term on the right-hand side. The scattering rate may also include phonon-defect scattering. Many studies have treated phonon transport in analogy to thermal radiative transfer.12, 36–43 In the following, a simplified case is used to illustrate how to model heat transfer across a thin film as well as multilayer structures. Let us consider a film of thickness $L$ between two boundaries without any internal source. The phonon BTE becomes

$$\frac{\partial f}{\partial t} + v_x\frac{\partial f}{\partial x} = \frac{f_0 - f}{\tau} \tag{7.29}$$

Realizing the nonequilibrium distribution function may be anisotropic, let us define Iv(x,,t)

1 vg Uv f D(v) 4p a P

(7.30)

where P is the number of phonon modes or polarizations. Equation (7.30) gives the phonon intensity, which is the energy transfer rate in the direction from a unit area, per unit frequency and per unit solid angle. The geometry of the problem and illustration of the intensity is given in Fig. 7.5. In this section, we use vg for the group velocity and vp for the

q''w T1 +

Iw

_ Iw q

q

x

I ∗w

T2

FIGURE 7.5 Schematic of phonon radiative transfer inside a dielectric medium between two walls maintained at temperatures T1 and T2. These walls are like heat reservoirs, but their surfaces are not necessarily blackbodies.

phase velocity. Note that vx vgcos u, where u is the polar angle. Substituting Eq. (7.30) into Eq. (7.29), we obtain 'Iv I*v Iv 1 'Iv vg 't m 'x vgt

(7.31)


where $\mu = \cos\theta$ and $I_\omega^*(\omega,T)$ is the intensity for the equilibrium distribution, which is independent of the direction. Equation (7.31) is called the equation of phonon radiative transfer (EPRT).12,40 Comparing the EPRT with the ERT given in Eq. (2.52), we see that the scattering terms are neglected in the EPRT, and the emission and the absorption are replaced by the phonon collision terms. The phonon mean free path $\Lambda = v_g\tau$ is also called the phonon penetration depth (see Example 4-2). The inverse of the penetration depth, $1/\Lambda$, corresponds to the absorption coefficient in the ERT. Conversion to the EPRT allows well-established theories and numerical techniques, developed in radiative transfer, to be applied to solve Eq. (7.31) and to interpret the physical significance of the solutions.44,45 If $\tau$ does not depend on frequency, we are dealing with a gray medium. If the phonon Knudsen number $Kn = \Lambda/L \ll 1$, then most phonons will collide with phonons or defects inside the medium. This regime is called the acoustically thick limit, in analogy to the optically thick limit for photons. This is also known as the macroscale regime or the local equilibrium situation. Except at very short timescales, when a sudden local disturbance occurs, we expect that Fourier’s law is applicable and the heat conduction is by diffusion. On the other hand, if $Kn = \Lambda/L \gg 1$, phonons originating from one boundary will most likely reach the other boundary without colliding with other phonons or defects inside the medium. This is the ballistic regime, corresponding to free molecule flow for molecular gases. This regime is called the acoustically thin limit, where the phonon distribution inside the medium cannot be characterized by an equilibrium distribution function if the walls are at different temperatures, even in the steady state.
Because we are dealing with radiative transfer for phonons as we do for photons, from now on we will refer to $Kn = \Lambda/L \gg 1$ as the radiative thin limit and $Kn = \Lambda/L \ll 1$ as the radiative thick limit. Because the BTE is more fundamental than Fourier’s law, it works for either limit as well as between the two limits. It would be very useful if a macroscopic model could also be developed to bridge these two limits. Rather than referring readers to more specialized journal papers, in the following, we present some basic formulations that are logically connected with materials presented in earlier chapters. Note that $I_\omega^*$ corresponds to the equilibrium distribution function, which is independent of the direction. Using Bose-Einstein statistics, we have

$$I_\omega^*(\omega,T) = \sum_P \frac{v_g\hbar\omega}{e^{\hbar\omega/k_BT}-1}\,\frac{k^2}{(2\pi)^3}\frac{dk}{d\omega} = \sum_P \frac{\hbar\omega^3}{8\pi^3 v_p^2\left(e^{\hbar\omega/k_BT}-1\right)} \tag{7.32}$$

This equilibrium distribution is also the distribution function for blackbody radiation with $v_p$ replaced by the speed of light. Integrating Eq. (7.32) over all frequencies gives the total intensity for all three phonon modes as follows:

$$I^*(T) = \int_0^\infty I_\omega^*(\omega,T)\,d\omega = \frac{3k_B^4T^4}{8\pi^3\hbar^3v_a^2}\int_0^\infty \frac{x^3\,dx}{e^x-1} = \frac{\sigma'_{SB}T^4}{\pi} \tag{7.33}$$

where $\sigma'_{SB} = \pi^2k_B^4/(40\hbar^3v_a^2)$ is the phonon Stefan-Boltzmann constant, and $v_a$ is the average phase velocity of the two transverse and one longitudinal phonon modes, defined according to Eq. (5.7). Let us consider a solid at temperatures higher than the Debye temperature. The integration can be carried out to an upper limit $\omega_m$ with $x_m = \hbar\omega_m/k_BT \ll 1$. From the discussion following Eq. (5.13), one can easily show that

$$I^*(T) = \int_0^{\omega_m} I_\omega^*(\omega,T)\,d\omega = \frac{\omega_m^3k_BT}{8\pi^3v_p^2} \tag{7.34}$$

This integration is a good approximation, even at temperatures slightly lower than the Debye temperature. When phonons are at equilibrium, the energy flux is $\pi I^*$, which is obtained by integrating $I^*\cos\theta\,d\Omega$ over the hemisphere. According to Eq. (4.12), the energy density can be expressed as

$$u(T) = \frac{4\pi}{v_g}I^*(T) \tag{7.35}$$
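As a quick numerical check on Eq. (7.33), the sketch below evaluates the phonon Stefan-Boltzmann constant and compares the closed form $\sigma'_{SB}T^4/\pi$ with a direct quadrature of the Bose integral. The average phase velocity of 5000 m/s and the function names are illustrative assumptions, not values or routines taken from the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
KB = 1.380649e-23       # Boltzmann constant, J/K


def phonon_sigma_sb(v_a):
    """Phonon Stefan-Boltzmann constant, pi^2 kB^4 / (40 hbar^3 v_a^2)."""
    return math.pi**2 * KB**4 / (40.0 * HBAR**3 * v_a**2)


def bose_integral(x_max, n=20000):
    """Midpoint-rule estimate of the integral of x^3/(e^x - 1) on [0, x_max]."""
    h = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x**3 / math.expm1(x)
    return total * h


def total_intensity_low_t(temperature, v_a):
    """Total equilibrium intensity from the middle expression of Eq. (7.33)."""
    prefactor = 3.0 * KB**4 * temperature**4 / (8.0 * math.pi**3 * HBAR**3 * v_a**2)
    return prefactor * bose_integral(50.0)  # x = 50 stands in for infinity


if __name__ == "__main__":
    v_a = 5000.0  # assumed average phase velocity, m/s (hypothetical)
    temperature = 10.0
    closed_form = phonon_sigma_sb(v_a) * temperature**4 / math.pi
    print(total_intensity_low_t(temperature, v_a) / closed_form)
```

The printed ratio should be very close to unity, which confirms that the factor $\pi^4/15$ from the Bose integral has been absorbed correctly into $\sigma'_{SB}$.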

Note that the volumetric heat capacity is $C = du/dT$. We therefore obtain the low-temperature relation of the specific heat, i.e., the $T^3$ law, and the high-temperature relation of the specific heat, i.e., the Dulong-Petit law, as already derived in Sec. 5.1.2. It is important to pay attention to the meaning of $C$ in the kinetic expression of thermal conductivity:

$$\kappa = \frac{1}{3}Cv_g^2\tau \tag{7.36}$$

At very low temperatures, when $T \ll \Theta_D$, $C$ is the volumetric heat capacity of all phonon modes combined because only low-frequency modes or acoustic branches contribute to the specific heat. However, at temperatures close to the Debye temperature, phonons in the optical branches contribute little to the thermal conductivity, as already discussed in Chap. 6. The relative contributions of LA and TA branches are also temperature dependent. The Debye temperature for most materials, except diamond, is not much higher than room temperature (see Table 5.2). Therefore, we must treat $C$ as a fraction of the volumetric specific heat in dealing with Si, GaAs, Ge, ZnS, or GaN, near room temperature. Also, we must use the appropriate upper limit in the integral in calculating the total energy transfer when applying the EPRT. The heat flux per unit frequency interval can thus be expressed as

$$q''_\omega = \int_{4\pi} I_\omega\cos\theta\,d\Omega = 2\pi\int_{-1}^{1} I_\omega\,\mu\,d\mu \tag{7.37}$$

Energy balance at any given location requires that the incoming flux be the same as the outgoing flux, for both steady and transient states. This is the criterion for radiative equilibrium, which can be expressed as follows:

$$4\pi\int_0^{\omega_m}\frac{1}{\Lambda_\omega}I_\omega^*\,d\omega = 2\pi\int_0^{\omega_m}\int_{-1}^{1}\frac{1}{\Lambda_\omega}I_\omega\,d\mu\,d\omega \tag{7.38}$$

where $\Lambda_\omega$ is the mean free path at $\omega$, the $4\pi$ on the left-hand side comes from the integration over all solid angles in a sphere, and the $2\pi$ on the right-hand side comes from integration over the azimuthal angles. Equation (7.38) gives a definition of an effective phonon temperature $T^*$ based on $I_\omega^*(T^*,\omega)$. An equivalent expression can be obtained based on the energy density, viz.,

$$u(T^*) = \sum_P\sum_K \hbar\omega\,f(\omega,\Omega) \tag{7.39}$$

It follows that the local equilibrium condition can be rewritten as

$$I_\omega^* = \frac{1}{2}\int_{-1}^{1} I_\omega\,d\mu \tag{7.40}$$

Local equilibrium is a sufficient, but not necessary, condition for radiative equilibrium given in Eq. (7.38), regardless of whether the medium is gray or not. The physical significance of Eq. (7.40) is that the angular average of the intensity, at a given location and time, can be described by an equilibrium intensity at the effective temperature. Obviously, Eq. (7.40) is not applicable in the radiative thin limit, unless the temperature difference between the two boundaries is negligibly small.


Example 7-3. Consider a dielectric medium of thickness $L = 0.01\Lambda$, where $\Lambda$ is independent of wavelength. The boundary or wall temperatures are $T_1 = 100$ K and $T_2 = 20$ K. Both temperatures are much lower than the Debye temperature. Assume that reflection at the boundaries is negligible, i.e., the walls can be modeled as blackbodies. Find the steady-state temperature of the medium and the heat flux through the medium.

Solution. Because $Kn = \Lambda/L \gg 1$, the medium is said to be in the radiative thin limit, in which phonons travel from one wall to another ballistically with little chance of being scattered by other phonons or defects inside the medium. The forward intensity can be expressed as $I_\omega^+ = I_\omega^*(T_1,\omega)$ for $\mu > 0$, and the backward intensity as $I_\omega^- = I_\omega^*(T_2,\omega)$ for $\mu < 0$. From Eq. (7.37), we have

$$q''_x = \int_0^\infty q''_\omega\,d\omega = 2\pi\int_0^\infty\int_0^1 \left(I_\omega^+ - I_\omega^-\right)\mu\,d\mu\,d\omega = \sigma'_{SB}\left(T_1^4 - T_2^4\right) \tag{7.41}$$

For heat conduction, the above equation is called the Casimir limit (Physica, 5, 595, 1938). To numerically evaluate this equation, we need data for $v_a$. From Eq. (7.38), we have

$$\frac{\sigma'_{SB}T^4}{\pi} = \frac{1}{2}\int_0^{\omega_m}\left(I_\omega^+ + I_\omega^-\right)d\omega = \frac{1}{2\pi}\left(\sigma'_{SB}T_1^4 + \sigma'_{SB}T_2^4\right) \tag{7.42}$$

where $T$ is the effective temperature inside the medium, $0 < x < L$, so that $T = [(T_1^4 + T_2^4)/2]^{1/4} \approx 84.1$ K for the given wall temperatures. Since $T(0) = T_1$ and $T(L) = T_2$ are the boundary conditions, there is a temperature jump at each boundary. We notice immediately that Eq. (7.40) cannot be satisfied with the temperature defined previously. If we force

$$I_\omega^* = \frac{1}{2}\int_{-1}^{1} I_\omega\,d\mu = \frac{1}{2}\left(I_\omega^+ + I_\omega^-\right) \tag{7.43}$$

we would end up with different temperatures at each frequency. In the next chapter (Sec. 8.2.3), we will further discuss the concept of monochromatic temperature. If the walls are not black but diffuse-gray with emissivities $\varepsilon_1$ and $\varepsilon_2$, similar to Eq. (2.51), the heat flux becomes

$$q''_x = \frac{\sigma'_{SB}T_1^4 - \sigma'_{SB}T_2^4}{1/\varepsilon_1 + 1/\varepsilon_2 - 1} \tag{7.44}$$
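The radiative thin results of Example 7-3 are easy to evaluate numerically. The following is a minimal sketch of Eqs. (7.41), (7.42), and (7.44); the average phase velocity passed to the flux function is an assumed, illustrative input (the example itself does not specify $v_a$), and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
KB = 1.380649e-23       # Boltzmann constant, J/K


def phonon_sigma_sb(v_a):
    """Phonon Stefan-Boltzmann constant for average phase velocity v_a."""
    return math.pi**2 * KB**4 / (40.0 * HBAR**3 * v_a**2)


def effective_temperature_thin(t1, t2):
    """Effective medium temperature from Eq. (7.42): T^4 = (T1^4 + T2^4)/2."""
    return (0.5 * (t1**4 + t2**4)) ** 0.25


def ballistic_flux(t1, t2, v_a, eps1=1.0, eps2=1.0):
    """Heat flux in the radiative thin limit, Eq. (7.44).

    With eps1 = eps2 = 1 this reduces to the Casimir limit, Eq. (7.41).
    """
    sigma = phonon_sigma_sb(v_a)
    return sigma * (t1**4 - t2**4) / (1.0 / eps1 + 1.0 / eps2 - 1.0)


if __name__ == "__main__":
    t1, t2 = 100.0, 20.0
    v_a = 5000.0  # assumed value in m/s, for illustration only
    print("effective temperature [K]:", effective_temperature_thin(t1, t2))
    print("black-wall flux [W/m^2]:", ballistic_flux(t1, t2, v_a))
    print("gray-wall flux [W/m^2]:", ballistic_flux(t1, t2, v_a, 0.5, 0.5))
```

Because the fourth powers are averaged, the effective temperature sits much closer to the hot wall than the arithmetic mean of 60 K would suggest.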

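7.2.2 Solution of the EPRT

The two-flux method is very helpful in developing a solution of the EPRT in planar structures. The equations for the forward and backward intensities, denoted respectively by superscripts (+) and (−), can be separated. In the steady state, we have

$$\mu\frac{\partial I_\omega^+}{\partial x} = \frac{I_\omega^* - I_\omega^+}{\Lambda}, \quad \text{when } 0 < \mu < 1 \tag{7.45a}$$

$$\mu\frac{\partial I_\omega^-}{\partial x} = \frac{I_\omega^* - I_\omega^-}{\Lambda}, \quad \text{when } -1 < \mu < 0 \tag{7.45b}$$

where we have assumed that the medium is gray.44,45 If we further assume that the walls are diffuse and gray, then the boundary conditions become

$$T(0) = T_1 \quad \text{and} \quad T(L) = T_2 \tag{7.46}$$

Thus,

$$I_\omega^+(0,\mu) = \varepsilon_1 I_\omega^*(T_1) + (1-\varepsilon_1)\,I_\omega^-(0,-\mu) \tag{7.47}$$

$$I_\omega^-(L,\mu) = \varepsilon_2 I_\omega^*(T_2) + (1-\varepsilon_2)\,I_\omega^+(L,-\mu) \tag{7.48}$$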


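The solutions of Eq. (7.45a) and Eq. (7.45b) can be expressed as follows:

$$I_\omega^+(x,\mu) = I_\omega^+(0,\mu)\exp\left(-\frac{x}{\mu\Lambda}\right) + \int_0^x I_\omega^*(\xi)\exp\left(-\frac{x-\xi}{\mu\Lambda}\right)\frac{d\xi}{\mu\Lambda}, \quad \text{for } \mu > 0 \tag{7.49}$$

and

$$I_\omega^-(x,\mu) = I_\omega^-(L,\mu)\exp\left(\frac{L-x}{\mu\Lambda}\right) - \int_x^L I_\omega^*(\xi)\exp\left(\frac{\xi-x}{\mu\Lambda}\right)\frac{d\xi}{\mu\Lambda}, \quad \text{for } \mu < 0 \tag{7.50}$$

In Eq. (7.49), the first term represents intensity originating from the left surface, after being attenuated, and the second term is the contribution of generation that is subject to attenuation as well. Equation (7.50) is viewed reversely for intensity from the right to the left. The spectral heat flux, defined in Eq. (7.37), can be written as

$$q''_\omega = 2\pi\int_0^1\left[I_\omega^+(0,\mu)\exp\left(-\frac{x}{\mu\Lambda}\right) - I_\omega^-(L,-\mu)\exp\left(-\frac{L-x}{\mu\Lambda}\right)\right]\mu\,d\mu + 2\pi\int_0^x I_\omega^*(\xi)E_2\left(\frac{x-\xi}{\Lambda}\right)\frac{d\xi}{\Lambda} - 2\pi\int_x^L I_\omega^*(\xi)E_2\left(\frac{\xi-x}{\Lambda}\right)\frac{d\xi}{\Lambda} \tag{7.51a}$$

Here again, $E_m(x) = \int_0^1 \mu^{m-2}e^{-x/\mu}\,d\mu$ is the $m$th exponential integral. If the surface is diffuse, then

$$q''_\omega = 2\pi I_\omega^+(0)E_3\left(\frac{x}{\Lambda}\right) - 2\pi I_\omega^-(L)E_3\left(\frac{L-x}{\Lambda}\right) + 2\pi\int_0^x I_\omega^*(\xi)E_2\left(\frac{x-\xi}{\Lambda}\right)\frac{d\xi}{\Lambda} - 2\pi\int_x^L I_\omega^*(\xi)E_2\left(\frac{\xi-x}{\Lambda}\right)\frac{d\xi}{\Lambda} \tag{7.51b}$$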
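The exponential integrals that appear in Eq. (7.51) can be evaluated directly from their definition. The sketch below uses a plain midpoint rule on $E_m(x) = \int_0^1 \mu^{m-2}e^{-x/\mu}d\mu$ and checks two standard properties, $E_m(0) = 1/(m-1)$ for $m \ge 2$ and $dE_3/dx = -E_2$. The quadrature and the function name are illustrative choices, not part of the original derivation.

```python
import math


def exp_integral(m, x, steps=20000):
    """Midpoint-rule estimate of E_m(x) for integer m >= 2 and x >= 0.

    The integrand mu^(m-2) * exp(-x/mu) is evaluated at cell centers,
    so mu = 0 is never sampled.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) * h
        total += mu ** (m - 2) * math.exp(-x / mu)
    return total * h


if __name__ == "__main__":
    print("E2(0) =", exp_integral(2, 0.0))   # expected to be close to 1
    print("E3(0) =", exp_integral(3, 0.0))   # expected to be close to 1/2
    dx = 1e-3
    slope = (exp_integral(3, 0.5 + dx) - exp_integral(3, 0.5 - dx)) / (2 * dx)
    print("dE3/dx at 0.5:", slope, " -E2(0.5):", -exp_integral(2, 0.5))
```

The derivative relation is what turns the $E_3$ terms of the flux into $E_2$ terms when the flux is differentiated with respect to $x$.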

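Note that energy balance requires that $\dfrac{dq''_x}{dx} = \dfrac{\partial}{\partial x}\displaystyle\int_0^{\omega_m} q''_\omega(x,\omega)\,d\omega = 0$. Differentiation of Eq. (7.51a) yields

$$\frac{\partial q''_\omega}{\partial x} = -\frac{2\pi}{\Lambda}I_\omega^+(0)E_2\left(\frac{x}{\Lambda}\right) - \frac{2\pi}{\Lambda}I_\omega^-(L)E_2\left(\frac{L-x}{\Lambda}\right) - \frac{2\pi}{\Lambda}\int_0^L I_\omega^*(\xi)E_1\left(\frac{|x-\xi|}{\Lambda}\right)\frac{d\xi}{\Lambda} + \frac{4\pi}{\Lambda}I_\omega^*(x) \tag{7.52}$$

In radiative transfer, we call $J_1 = \int \pi I_\omega^+(0)\,d\omega$ and $J_2 = \int \pi I_\omega^-(L)\,d\omega$ the total radiosities at surfaces 1 and 2, respectively, and $e_b(T) = \int \pi I_\omega^*\,d\omega$ the total blackbody emissive power. Therefore,

$$2e_b(T(x)) = J_1E_2\left(\frac{x}{\Lambda}\right) + J_2E_2\left(\frac{L-x}{\Lambda}\right) + \int_0^L e_b(T(\xi))E_1\left(\frac{|x-\xi|}{\Lambda}\right)\frac{d\xi}{\Lambda} \tag{7.53}$$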

This is the same as the radiative equilibrium condition, given in Eq. (7.38). We cannot set Eq. (7.52) to zero at all frequencies, when local equilibrium does not exist, even for a gray medium.

Example 7-4. Find the temperature distribution, the heat flux, and the thermal conductivity for a gray medium, with diffuse-gray surfaces, in the radiative thick limit, i.e., $Kn \ll 1$, under two extreme conditions: (1) $T_1, T_2 \ll \Theta_D$ and (2) $T_1, T_2 > \Theta_D$.


Solution. In the radiative thick limit, the first two terms in Eq. (7.51a) can be dropped as long as $x$ is not too close to either surface. Applying the first-order Taylor expansion $I_\omega^*(x) = I_\omega^*(\xi) + (dI_\omega^*/dx)(x-\xi) + \cdots$ and letting $z = |x-\xi|/\Lambda$ in the third and fourth terms, we obtain

$$q''_\omega = -4\pi\Lambda\frac{\partial I_\omega^*}{\partial x}\int_0^\infty zE_2(z)\,dz = -\frac{4\pi\Lambda}{3}\frac{\partial I_\omega^*}{\partial x} \tag{7.54}$$

since $\int_0^\infty zE_2(z)\,dz = 1/3$. In fact, this equation applies everywhere inside the medium because the spectral heat flux is continuous in the radiative thick limit. Integrating Eq. (7.54) over the frequencies of interest, we see that, under condition (1),

$$q''_x = -\frac{16\sigma'_{SB}T^3\Lambda}{3}\frac{dT}{dx}, \quad \text{when } T \ll \Theta_D \tag{7.55a}$$

This is nothing but a heat diffusion equation if we define the thermal conductivity as

$$\kappa(T) = \frac{16}{3}\sigma'_{SB}T^3\Lambda \tag{7.55b}$$

Comparing with $\kappa(T) = \frac{1}{3}Cv_g\Lambda$, we notice from the previous equation that $Cv_g = 16\sigma'_{SB}T^3$. In the radiative thick limit, the temperature distribution is continuous at the wall, i.e., $T(0^+) = T(0) = T_1$ and $T(L^-) = T(L) = T_2$. Furthermore, the radiosity at the wall becomes the blackbody emissive power, even though the surface is not black; thus, we can integrate Eq. (7.54) over $x$ from 0 to $L$:

$$\int_0^L q''_x\,dx = -\frac{4}{3}\Lambda\sigma'_{SB}\int_{T_1}^{T_2}4T^3\,dT$$

which gives

$$q''_x = \frac{4}{3}Kn\left(\sigma'_{SB}T_1^4 - \sigma'_{SB}T_2^4\right) \tag{7.56a}$$

as well as the temperature distribution:

$$T(x) = \left[T_1^4 - \frac{x}{L}\left(T_1^4 - T_2^4\right)\right]^{1/4} \tag{7.56b}$$

which is linear in terms of the fourth power of temperature. From the definition of thermal resistance, $q''_x = (T_1 - T_2)/R''_t$, we have

$$R''_t = \frac{3}{4\sigma'_{SB}Kn\,(T_1+T_2)(T_1^2+T_2^2)} \tag{7.57}$$
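The low-temperature, radiative thick results of Eqs. (7.56a), (7.56b), and (7.57) can be collected into a few helper functions. This is a minimal sketch: the value used for the phonon Stefan-Boltzmann constant and the Knudsen number are placeholder inputs chosen only for illustration, and the function names are hypothetical.

```python
def thick_limit_temperature(x_over_l, t1, t2):
    """Temperature profile of Eq. (7.56b); linear in T^4."""
    return (t1**4 - x_over_l * (t1**4 - t2**4)) ** 0.25


def thick_limit_flux(t1, t2, kn, sigma_sb):
    """Heat flux of Eq. (7.56a)."""
    return (4.0 / 3.0) * kn * sigma_sb * (t1**4 - t2**4)


def thick_limit_resistance(t1, t2, kn, sigma_sb):
    """Thermal resistance of Eq. (7.57)."""
    return 3.0 / (4.0 * sigma_sb * kn * (t1 + t2) * (t1**2 + t2**2))


if __name__ == "__main__":
    t1, t2 = 30.0, 20.0
    kn = 0.01          # radiative thick: Kn << 1
    sigma_sb = 300.0   # placeholder value in W/(m^2 K^4), not from the text
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(s, thick_limit_temperature(s, t1, t2))
    q = thick_limit_flux(t1, t2, kn, sigma_sb)
    r = thick_limit_resistance(t1, t2, kn, sigma_sb)
    print(q, (t1 - t2) / r)  # the two numbers should agree
```

The agreement between the last two printed numbers follows from the factorization $T_1^4 - T_2^4 = (T_1 - T_2)(T_1 + T_2)(T_1^2 + T_2^2)$.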

Under condition (2), when the temperature is greater than the Debye temperature, we have

$$q''_x = -\frac{\omega_m^3k_B\Lambda}{6\pi^2v_p^2}\frac{dT}{dx}, \quad \text{when } T > \Theta_D \tag{7.58}$$

The thermal conductivity becomes $\kappa(T) = \omega_m^3k_B\Lambda/(6\pi^2v_p^2)$, which, by comparison with $\kappa = \frac{1}{3}Cv_g\Lambda$, implies that $Cv_g = \omega_m^3k_B/(2\pi^2v_p^2)$. A proper $\omega_m$ should be chosen so that only propagating phonons are considered. Assuming that the temperature difference is small so that we can approximate the thermal conductivity as a constant, we have

$$q''_x = \frac{Cv_gKn}{3}(T_1 - T_2) \tag{7.59}$$

The thermal resistance becomes $R''_t = 3/(Cv_gKn)$, which increases as $L$ increases. The temperature distribution is linear. One should realize that the scattering rate increases with temperature, due to phonon-phonon scattering, and depends on the frequency. If we look at the radiative equilibrium condition again, by assuming $T_1 \approx T_2$, we see that $I_\omega^+ \approx I_\omega^- \approx I_\omega^*$. Therefore, local equilibrium is not a stable-equilibrium state. In the radiative thick limit, the difference between $I_\omega^+$ and $I_\omega^-$ is caused by the spatial variation of $I_\omega^*$, as can be clearly seen from Eq. (7.49) and Eq. (7.50). Hence, Eq. (7.40) is a good approximation. In the radiative thin limit, according to Eq. (7.34), Eq. (7.59) becomes

$$q''_x = \frac{1}{4}Cv_g(T_1 - T_2) \tag{7.60}$$

Although no closed form exists for the solution of the ERT between the thick and thin limits, a number of approximation techniques and numerical methods can be used to provide satisfactory solutions, such as the discrete ordinates method ($S_N$ approximation) and the spherical harmonics method ($P_N$ approximation). It is important to see that except in the radiative thick limit, energy transfer occurs inside the medium in two ways: one is through exchange with the walls, and the other is through diffusion. For this reason, a ballistic-diffusion approximation has been developed to solve the EPRT; see Chen (Phys. Rev. Lett., 86, 2297, 2001). In general, the temperature distribution looks like that in Fig. 4.12b if $T_2$ is comparable to the Debye temperature. If $T_1 \ll \Theta_D$, then the temperature distribution can be plotted in terms of $T^4$ so that the distribution looks more or less linear. There exists a temperature jump such that $T(0^+) \ne T(0)$ and $T(L^-) \ne T(L)$, except in the radiative thick limit. Understanding that the temperature is only an effective temperature and given such a temperature distribution, one may assume that there is a thermal resistance at each boundary and an internal thermal resistance, which may be described by Fourier’s heat conduction.41 For thermal radiative transfer in the absence of heat conduction, there exists a radiation slip or radiation jump at the boundary, unless the medium is optically thick. Without a participating medium, photons do not scatter off one another to dissipate heat or transfer heat by diffusion. This is a distinction between photons and phonons. Radiation slip is manifested by a discontinuous change of the intensity at the boundary. The temperature in the medium adjacent to the wall differs from the surface temperature. Such a temperature jump does not exist in classical Fourier’s heat conduction theory; however, both velocity slip and temperature jump have already been incorporated in microfluidics research, as discussed in Chap. 4 [see Eq. (4.94)]. The temperature-jump concept was first applied in the study of heat conduction in rarefied gases over 100 years ago. A straightforward approach for phonon transport is to sum up the thermal resistances in the radiative thin and thick limits. The heat flux at very low temperatures can be expressed as

$$q''_x = \frac{4\Lambda}{3L}\,\frac{\sigma'_{SB}T_1^4 - \sigma'_{SB}T_2^4}{1 + \dfrac{4Kn}{3}\left(\dfrac{1}{\varepsilon_1} - \dfrac{1}{2} + \dfrac{1}{\varepsilon_2} - \dfrac{1}{2}\right)} \tag{7.61}$$

Here, we separately write $(1/\varepsilon_1 - 1/2)$ and $(1/\varepsilon_2 - 1/2)$ to emphasize the thermal resistance due to radiation slip at each boundary. In the radiative thick limit, the temperature jump approaches zero as $Kn \to 0$. Equation (7.61) reduces to Eq. (7.44) and Eq. (7.56a) in the two extremes. If the walls can be treated as blackbodies, i.e., $\varepsilon_1 = \varepsilon_2 = 1$, and the temperature difference between $T_1$ and $T_2$ is small, we can approximate the heat flux as follows:

$$q''_x = \frac{\kappa_b}{1 + 4Kn/3}\,\frac{\Delta T}{L} = \kappa_{\mathrm{eff}}\frac{\Delta T}{L} \tag{7.62}$$

where $\Delta T = T_1 - T_2 \ll T_2 < T_1$, the bulk thermal conductivity is $\kappa_b(T) = \frac{16}{3}\sigma'_{SB}T^3\Lambda$, and the effective conductivity of the film is

$$\kappa_{\mathrm{eff}} = \frac{\kappa_b}{1 + 4Kn/3} \tag{7.63}$$
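The way Eq. (7.61) interpolates between the two limits can be checked numerically. The sketch below implements Eq. (7.61) and the conductivity ratio of Eq. (7.63), then compares the large-$Kn$ and small-$Kn$ behavior with Eq. (7.44) and Eq. (7.56a). The numerical value of the phonon Stefan-Boltzmann constant and the emissivities are placeholder inputs for illustration only.

```python
def bridged_flux(t1, t2, kn, sigma_sb, eps1=1.0, eps2=1.0):
    """Heat flux of Eq. (7.61), with 4*Lambda/(3L) written as 4*Kn/3."""
    slip = (1.0 / eps1 - 0.5) + (1.0 / eps2 - 0.5)  # radiation-slip terms
    driving = sigma_sb * (t1**4 - t2**4)
    return (4.0 * kn / 3.0) * driving / (1.0 + (4.0 * kn / 3.0) * slip)


def conductivity_ratio(kn):
    """kappa_eff / kappa_b for black walls, Eq. (7.63)."""
    return 1.0 / (1.0 + 4.0 * kn / 3.0)


if __name__ == "__main__":
    sigma_sb = 300.0  # placeholder value in W/(m^2 K^4), not from the text
    t1, t2 = 30.0, 20.0
    for kn in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(kn, bridged_flux(t1, t2, kn, sigma_sb), conductivity_ratio(kn))
```

At $Kn = 1$ the film already conducts at only 3/7 of the bulk value, which illustrates how quickly the boundary resistances dominate once the thickness approaches the mean free path.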


At relatively high temperatures close to the Debye temperature, from Eq. (7.59) and Eq. (7.60), we can write

$$q''_x = \frac{\kappa_b}{1 + \dfrac{4Kn}{3}\left(\dfrac{1}{\varepsilon_1} + \dfrac{1}{\varepsilon_2} - 1\right)}\,\frac{T_1 - T_2}{L} = \kappa_{\mathrm{eff}}\frac{T_1 - T_2}{L} \tag{7.64}$$

where $\kappa_b(T) = \frac{1}{3}Cv_g\Lambda$. Equation (7.64) gives the same conductivity ratio $\kappa_{\mathrm{eff}}/\kappa_b$ as in Eq. (7.63) for blackbody walls. These effective thermal conductivities are on the same order of magnitude as we have derived in Sec. 5.5.5, based on simple geometric arguments and Matthiessen’s rule for the mean free path given in Eq. (5.116). In previous chapters, however, we did not elaborate in detail on the nature of nonequilibrium and the necessity of defining an effective temperature. It is interesting that different schools of thought can produce rather consistent results. The heat diffusion equation per se cannot tell us the cause of a temperature jump or how to evaluate it. The phonon BTE enables us to explore the microscopic phenomena and helps evaluate the parameters and the properties. The microscopic understanding and the macroscopic phenomenological equations can indeed work together to provide an effective thermal analysis tool. The preceding discussions are consistent with the detailed derivation of the temperature jump or the radiation slip, originally formulated by Deissler (J. Heat Transfer, C86, 240, 1964), for situations not too far from the radiative thick limit. Nevertheless, the expressions given in Eq. (7.61) and Eq. (7.64) can be approximately applied between the diffusion and ballistic extremes. It should be noted that when the temperature jump is treated as a thermal resistance at the boundary, Fourier’s law can be used for the heat conduction inside the medium with the bulk thermal conductivity. This is very different from heat conduction along the film. While there seems to be no problem in understanding the meaning of emissivity for optical radiation, a question still remains as to how to interpret the boundary conditions in the case of phonon conduction.
If a multilayer structure is considered, we need to better understand the reflection and the transmission of phonons at the interfaces between dissimilar materials. A three-layer structure is shown in Fig. 7.6 to illustrate the temperature distribution in a multilayer structure.

FIGURE 7.6 Temperature distribution in a multilayer structure, with thermal boundary resistance, and the thermal resistance network representation. Here, Ri is the internal resistance in the ith layer due to heat conduction, and Rij is the thermal boundary resistance between the ith and jth media. Two temperatures are needed to specify the effective temperature of different media at the interface.

Depending on the temperature range, it seems that we can conveniently determine the internal thermal resistance with Fourier’s law, i.e., $R''_i = L_i/\kappa_i$. For the thermal resistance at the interface inside the layered structures, we could replace the emissivity with the transmissivity $\Gamma_{ij}$ such that $R''_{ij} = \frac{4}{3}(\Lambda_i/\kappa_i)(\Gamma_{ij}^{-1} - \frac{1}{2}) + \frac{4}{3}(\Lambda_j/\kappa_j)(\Gamma_{ji}^{-1} - \frac{1}{2})$. At the boundaries, we can still use $R''_{H1} = \frac{4}{3}(\Lambda_1/\kappa_1)(\varepsilon_1^{-1} - \frac{1}{2})$ and $R''_{3L} = \frac{4}{3}(\Lambda_3/\kappa_3)(\varepsilon_3^{-1} - \frac{1}{2})$. The heat flux can be estimated by $q''_x = (T_H - T_L)/R''_{\mathrm{tot}}$, where $R''_{\mathrm{tot}}$ is the sum of all thermal resistances. The effective thermal conductivity of the whole layered structure becomes $\kappa_{\mathrm{eff}} = L_{\mathrm{tot}}/R''_{\mathrm{tot}}$. The details were presented by Chen and Zeng, who further considered nondiffuse surfaces and defined equivalent equilibrium temperatures.41 The assumption is that the deviation from the radiative thick limit is not significant. If we are dealing with the ballistic regime, we might need to consider phonon wave effects as well as the quantum size effect. Models for thermal boundary resistance will be discussed in the next subsection. It is intriguing to apply the same approach to electron systems for the study of both electrical conductivity and thermal conductivity of metallic solids, as well as metal-dielectric multilayer structures. The thermal resistance network method, however, cannot be easily extended to multidimensional problems or to transient heating by a localized heat source. Statistical models or atomistic simulations are necessary. Therefore, the extension of Fourier’s law for 1-D nonequilibrium heat transfer should be considered only as a special case.
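The resistance network described above is simple to assemble in code. The following is a minimal sketch, assuming diffuse boundaries and only a modest deviation from the radiative thick limit, as stated in the text. The `Layer` container, the function names, and all numerical inputs are hypothetical and serve only to illustrate how the internal, interface, and boundary resistances add in series.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    thickness: float     # L_i, m
    conductivity: float  # bulk kappa_i, W/(m K)
    mfp: float           # mean free path Lambda_i, m


def slip_resistance(layer, coupling):
    """(4/3)(Lambda/kappa)(1/coupling - 1/2); coupling is an emissivity
    at an outer wall or a transmissivity at an internal interface."""
    return (4.0 / 3.0) * (layer.mfp / layer.conductivity) * (1.0 / coupling - 0.5)


def total_resistance(layers, eps_hot, eps_cold, gammas):
    """Series sum of all resistances.

    gammas[i] = (Gamma_{i,i+1}, Gamma_{i+1,i}) for the interface between
    layers i and i+1 (zero-based indices).
    """
    if len(gammas) != len(layers) - 1:
        raise ValueError("need one transmissivity pair per internal interface")
    r = slip_resistance(layers[0], eps_hot) + slip_resistance(layers[-1], eps_cold)
    r += sum(layer.thickness / layer.conductivity for layer in layers)
    for i, (g_forward, g_backward) in enumerate(gammas):
        r += slip_resistance(layers[i], g_forward)
        r += slip_resistance(layers[i + 1], g_backward)
    return r


def effective_conductivity(layers, eps_hot, eps_cold, gammas):
    """kappa_eff = L_tot / R_tot for the whole stack."""
    l_tot = sum(layer.thickness for layer in layers)
    return l_tot / total_resistance(layers, eps_hot, eps_cold, gammas)


if __name__ == "__main__":
    # Hypothetical three-layer stack; values are for illustration only.
    stack = [Layer(200e-9, 100.0, 50e-9),
             Layer(100e-9, 40.0, 20e-9),
             Layer(200e-9, 100.0, 50e-9)]
    gammas = [(0.4, 0.6), (0.6, 0.4)]
    print(effective_conductivity(stack, 1.0, 1.0, gammas))
```

For a single layer with black walls, the sum reduces to $(L/\kappa)(1 + 4Kn/3)$, so the sketch reproduces the conductivity ratio of Eq. (7.63).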

7.2.3 Thermal Boundary Resistance (TBR)

Thermal resistance at the interface between dissimilar materials is very important for heat transfer in heterostructures. Let us first clarify the difference between thermal contact resistance and thermal boundary resistance (TBR). The former refers to the thermal resistance between two bodies, usually with very rough surfaces whose root-mean-square roughness $\sigma_{\mathrm{rms}}$ is greater than 0.5 μm, brought or joined together mechanically. For thermal contact resistance, readers are referred to a recent comprehensive review by Yovanovich.46 Originally, TBR referred to the resistance at the interface between two solids or between a liquid and a dielectric at low temperatures. Even when the materials are in perfect contact with each other, reflections occur when phonons travel toward the boundary, because of the difference in acoustic properties of adjacent materials. In practice, the interface can be atomically smooth, or have a roughness ranging from several tenths of a nanometer to several nanometers. The thermal resistance between a solid material and liquid helium is called the Kapitza resistance, first observed by the Russian physicist and 1978 Nobel Laureate Pyotr Kapitza, in the 1940s. This thermal resistance results in a temperature discontinuity at the boundary and has been modeled using the acoustic mismatch model (AMM). Thermal boundary resistance exists between two dielectrics as well as between a metal and a dielectric. In a thin-film structure, an interface is often accompanied by the formation of an intermediate layer of mixed atoms. An extensive review of earlier studies can be found in the work of Swartz and Pohl in 1989;38 see also Stoner and Maris (Phys. Rev. B, 48, 16373, 1993). Prasher and Phelan (J. Supercond., 10, 473, 1997) reviewed the studies of TBR of high-temperature superconductors in both the normal and superconducting states, for applications in superconducting electronics and radiation detectors.
Little showed that the heat flux across the boundary of a perfectly joined interface between two solids is proportional to the difference in the fourth power of temperature on each side of the interface.39 This can be understood based on previous discussions of phonon radiative transfer and blackbody radiation. Consider longitudinal phonon modes that follow the linear dispersion in a Debye crystal, and assume that the interface is perfectly smooth.


At any given frequency, the transmission coefficients can be written as follows (with a small modification for consistency):39,47

$$\tau_{12} = \frac{4\rho_1\rho_2v_{l1}^2\cos\theta_1\cos\theta_2}{(\rho_1v_{l1}\cos\theta_2 + \rho_2v_{l2}\cos\theta_1)^2} \tag{7.65a}$$

$$\tau_{21} = \frac{4\rho_1\rho_2v_{l2}^2\cos\theta_1\cos\theta_2}{(\rho_1v_{l1}\cos\theta_2 + \rho_2v_{l2}\cos\theta_1)^2} \tag{7.65b}$$

where subscripts 1 and 2 denote the media 1 and 2, respectively, $\rho$ is the density, $v_l$ is the propagation speed of longitudinal phonons, and $\theta$ is the polar angle, as illustrated in Fig. 7.7. The scattering is assumed to be purely elastic since the phonon frequency is conserved. An analog of Snell’s law can be written as follows:

$$\frac{1}{v_{l1}}\sin\theta_1 = \frac{1}{v_{l2}}\sin\theta_2 \tag{7.66}$$

FIGURE 7.7 Schematic of phonon transport across an interface between two semi-infinite media, each at a thermal equilibrium.

If $v_{l1} > v_{l2}$, for incidence from medium 2 to 1, there exists a critical angle $\theta_c = \sin^{-1}(v_{l2}/v_{l1})$, beyond which all phonons will be reflected. Due to the boundary resistance, there will be a temperature difference across the interface. By assuming that the phonons are at equilibrium on either side, the heat flux from medium 1 to 2 can be expressed as follows:

$$q''_{1\to2} = \frac{1}{4\pi}\int_0^{\omega_m}\int_0^{2\pi}\int_0^{\pi/2}\hbar\omega\,v_{l1}f_1(\omega,T_1)\,\tau_{12}\,D(\omega)\cos\theta_1\sin\theta_1\,d\theta_1\,d\phi_1\,d\omega \tag{7.67}$$

If the distribution function is isotropic over the hemisphere, we have

$$q''_{1\to2} = \frac{1}{4}\frac{\Gamma_{12}}{v_{l1}^2}\int_0^{\omega_m}\hbar\omega\,v_{l1}^3f_1(\omega,T_1)D(\omega)\,d\omega \tag{7.68}$$

where

$$\Gamma_{12} = \frac{1}{\pi}\int_0^{2\pi}\int_0^{\pi/2}\tau_{12}\cos\theta_1\sin\theta_1\,d\theta_1\,d\phi = 2\int_0^{\pi/2}\tau_{12}\cos\theta_1\sin\theta_1\,d\theta_1 \tag{7.69}$$

can be viewed as the hemispherical transmissivity. Note that

$$\frac{\Gamma_{21}}{v_{l2}^2} = 2\int_0^{\theta_c}\frac{\tau_{21}}{v_{l2}^2}\cos\theta_2\sin\theta_2\,d\theta_2 = \frac{\Gamma_{12}}{v_{l1}^2} \tag{7.70}$$

For the Debye density of states, we have

$$\frac{1}{4\pi}v_l\,\hbar\omega f(\omega,T)D(\omega)\,d\omega = \frac{\hbar\omega^3\,d\omega}{8\pi^3v_l^2\left(e^{\hbar\omega/k_BT}-1\right)}$$

Therefore, the net heat flux across the interface becomes

$$q''_x = q''_{1\to2} - q''_{2\to1} = \frac{1}{4}\frac{\Gamma_{12}}{v_{l1}^2}\int_0^{\omega_m}\hbar\omega\left[v_{l1}^3f_1(\omega,T_1) - v_{l2}^3f_2(\omega,T_2)\right]D(\omega)\,d\omega \tag{7.71}$$

or

$$q''_x = \frac{\Gamma_{12}}{v_{l1}^2}\frac{k_B^4}{8\pi^2\hbar^3}\left(T_1^4\int_0^{x_{m1}}\frac{x^3\,dx}{e^x-1} - T_2^4\int_0^{x_{m2}}\frac{x^3\,dx}{e^x-1}\right) \tag{7.72}$$

where $x_{mj} = \hbar\omega_m/k_BT_j$. In the low-temperature limit, we obtain

$$q''_x = \frac{\Gamma_{12}}{v_{l1}^2}\frac{\pi^2k_B^4}{120\hbar^3}\left(T_1^4 - T_2^4\right) \tag{7.73}$$

After replacing $v_{l1}^{-2}$ with $\sum_j v_{j1}^{-2} = v_{l1}^{-2} + 2v_{t1}^{-2}$, i.e., one longitudinal and two transverse phonon modes, we obtain

$$q''_x = \frac{\pi^2k_B^4}{120\hbar^3}\left(T_1^4 - T_2^4\right)\Gamma_{12}\sum_j v_{j1}^{-2} \tag{7.74}$$

The TBR can now be obtained as $R''_b = (T_1 - T_2)/q''_x$. Furthermore, by assuming that the temperature difference is small, we can approximate $R''_b$ by

$$R''_b = \frac{30\hbar^3T^{-3}}{\pi^2k_B^4\,\Gamma_{12}\sum_j v_{j1}^{-2}} \tag{7.75}$$

which is inversely proportional to $T^3$. The characteristic wavelength is the most probable wavelength in the phonon distribution function. It can be approximated by

$$\lambda_{mp} \approx a\frac{\Theta_D}{T} \tag{7.76}$$

where $a$ is the lattice constant, on the order of 0.3 to 0.6 nm.47 Only when $\lambda_{mp} \gg \sigma_{\mathrm{rms}}$ can we assume that the scattering is completely specular. Even for atomically smooth interfaces, the characteristic wavelength for phonons will be on the same order of magnitude as the rms surface roughness, when the temperature approaches the Debye temperature. The specularity parameter introduced in Eq. (5.131) is often used to approximate the fraction of specular reflection with respect to the total reflection. Another expression of the specularity parameter is

$$p = \exp\left(-\frac{16\pi^2\sigma_{\mathrm{rms}}^2}{\lambda^2}\right) \tag{7.77}$$
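The interplay between Eq. (7.76) and Eq. (7.77) is easiest to see with numbers. The sketch below estimates the most probable phonon wavelength and the corresponding specularity parameter; the lattice constant, Debye temperature, and roughness used in the usage lines are assumed, illustrative values, and the function names are hypothetical.

```python
import math


def most_probable_wavelength(lattice_constant, debye_temperature, temperature):
    """Characteristic phonon wavelength from Eq. (7.76), a * Theta_D / T."""
    return lattice_constant * debye_temperature / temperature


def specularity(sigma_rms, wavelength):
    """Specularity parameter from Eq. (7.77)."""
    return math.exp(-16.0 * math.pi**2 * sigma_rms**2 / wavelength**2)


if __name__ == "__main__":
    a = 0.5e-9          # assumed lattice constant, m
    theta_d = 600.0     # assumed Debye temperature, K
    sigma_rms = 0.2e-9  # assumed rms roughness, m
    for temperature in (2.0, 10.0, 50.0, 300.0):
        lam = most_probable_wavelength(a, theta_d, temperature)
        print(temperature, lam, specularity(sigma_rms, lam))
```

Under these assumed inputs the reflection is nearly specular only at the lowest temperatures, where the characteristic wavelength greatly exceeds the roughness.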


Equation (7.77) has often been wrongly expressed with $\pi^2$ being mistaken as $\pi^3$ in the heat conduction literature, following a hidden typo in Ziman’s book, Electrons and Phonons.48 In the high-temperature limit, TBR is expected to be small, especially when compared with conduction in the solids. Other considerations are (a) the interface may not be perfectly smooth, (b) there exists an upper limit of the frequency or a lower limit of wavelength, and (c) phonons on either side of the boundary may not be in a local-equilibrium state. These difficulties pose some real challenges in modeling TBR. Nevertheless, we shall present the diffuse mismatch model (DMM) that was introduced by Swartz and Pohl.38 In the DMM, it is assumed that phonons will be scattered according to a probability determined by the properties of the two media but independent of where the phonons originate. For phonons coming from medium 1, the transmission and reflection probabilities are related by $\Gamma_{12} + R_{12} = 1$. For phonons originating from medium 2, on the other hand, $\Gamma_{21} = R_{12}$ and $R_{21} = \Gamma_{12}$. Hence, the reciprocity requires that

$$\Gamma_{12} + \Gamma_{21} = 1 \tag{7.78a}$$

We can rewrite Eq. (7.70), considering all three polarizations, as follows:

$$\Gamma_{12}\sum_j v_{j1}^{-2} = \Gamma_{21}\sum_j v_{j2}^{-2} \tag{7.78b}$$

Solving Eq. (7.78a) and Eq. (7.78b), we get

$$\Gamma_{12} = \frac{\sum_j v_{j2}^{-2}}{\sum_j v_{j1}^{-2} + \sum_j v_{j2}^{-2}} \tag{7.79}$$

The heat flux can be calculated according to

$$q''_x = \frac{k_B^4}{8\pi^2\hbar^3}\left(T_1^4\int_0^{x_{m1}}\frac{x^3\,dx}{e^x-1} - T_2^4\int_0^{x_{m2}}\frac{x^3\,dx}{e^x-1}\right)\Gamma_{12}\sum_j v_{j1}^{-2} \tag{7.80}$$

Equation (7.79) and Eq. (7.80) are the only equations needed to calculate TBR with the DMM. In addition to the Debye temperatures and the speeds of longitudinal and transverse waves, one would need to determine the upper limits of the integrals in Eq. (7.80). Alternatively, Eq. (7.80) can be recast using the volumetric heat capacities and the group velocities to obtain

$$q''_x = \frac{1}{4}\left(C_1v_{g1}T_1 - C_2v_{g2}T_2\right)\Gamma_{12} \tag{7.81}$$
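A short numerical sketch can make the DMM recipe concrete. It evaluates the transmissivity of Eq. (7.79) and, in the low-temperature limit where both Bose integrals in Eq. (7.80) approach $\pi^4/15$, the boundary resistance in the form of Eq. (7.75). The sound speeds used below are hypothetical, illustrative values rather than data for any particular pair of materials, and the function names are assumptions.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
KB = 1.380649e-23       # Boltzmann constant, J/K


def inverse_square_sum(speeds):
    """Sum of v_j^-2 over the phonon modes of one medium."""
    return sum(v ** -2 for v in speeds)


def dmm_transmissivity(speeds_1, speeds_2):
    """Gamma_12 from Eq. (7.79)."""
    s1 = inverse_square_sum(speeds_1)
    s2 = inverse_square_sum(speeds_2)
    return s2 / (s1 + s2)


def tbr_low_temperature(temperature, speeds_1, speeds_2):
    """Low-temperature TBR, Eq. (7.75), with Gamma_12 taken from the DMM."""
    gamma_12 = dmm_transmissivity(speeds_1, speeds_2)
    s1 = inverse_square_sum(speeds_1)
    return 30.0 * HBAR**3 / (math.pi**2 * KB**4 * temperature**3 * gamma_12 * s1)


if __name__ == "__main__":
    # One longitudinal and two transverse modes per medium (hypothetical, m/s).
    medium_1 = [8000.0, 5000.0, 5000.0]
    medium_2 = [5000.0, 3000.0, 3000.0]
    print("Gamma_12:", dmm_transmissivity(medium_1, medium_2))
    print("Gamma_21:", dmm_transmissivity(medium_2, medium_1))
    print("R_b at 10 K [m^2 K/W]:", tbr_low_temperature(10.0, medium_1, medium_2))
```

Phonons leaving the acoustically stiffer medium (higher sound speeds) are transmitted with the larger probability, which follows directly from Eq. (7.78b).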

One must be careful in applying the heat capacity in Eq. (7.81) since the heat capacity in the expression of thermal conductivity is different from $\rho c_p$, except at very low temperatures. Both the AMM and the DMM assume that the phonons are in equilibrium on each side of the interface, and do not take into account the nonequilibrium distribution of phonons. In multilayer thin films, especially in quantum wells and superlattices, when the film thickness is comparable with or smaller than the phonon mean free path, thermal transport inside the film cannot be modeled as pure diffusion anymore. A detailed treatment of temperature-jump conditions and boundary resistance in superlattices was performed by Chen and Zeng.40,41 Majumdar (J. Heat Transfer, 113, 797, 1991) proposed a modified AMM by modeling interface roughness using a fractal structure. In this study, the reflection was approximated by geometric optics, which is applicable when the phonon wavelength is smaller than the autocorrelation length of the rough surface. TBR between highly dissimilar materials, at metal-metal interfaces, and at metal-dielectric interfaces has been the subject of some recent studies; see Majumdar and Reddy (Appl. Phys. Lett., 84, 4768), Ju et al. (J. Heat Transfer, 128, 919, 2006), Lyeo and Cahill (Phys. Rev. B, 73, 144301, 2006), and Gundrum et al. (Phys. Rev. B, 72, 245426, 2006).

Let us consider how to model transient heat conduction in thin films. The single relaxation time approximation appears to be limited to the timescale $t > \tau$. Joshi and Majumdar (J. Appl. Phys., 74, 31, 1993) performed a transient analysis of the EPRT. However, the use of Eq. (7.43) implies the presumption of the local-equilibrium condition, which is not valid for large temperature gradients. If worked out properly, the EPRT is applicable for $t > \tau$. If the temperature jumps can be properly taken into consideration, it appears that Fourier’s law should be applicable when $t \gg \tau$. The question is, “How long will it take for the temperature-jump conditions to be justified?” In order to model a timescale less than $\tau$, it would be interesting to see if there exists another scattering mechanism that has a much smaller relaxation time than $\tau$ and that does not transfer or dissipate heat, like the N processes discussed earlier or phonons in the optical branch. In the two-fluid model, the superfluid moves forward freely, without any viscosity, but conserves kinetic energy as it moves around. This is the principle of superfluidity in liquid helium and superconductivity for electrons. The superfluid does not carry thermal energy, nor does it dissipate heat. Although the N processes do not carry heat forward, these processes are important for the redistribution of phonons. The two-relaxation-time model developed by Callaway21 and Guyer and Krumhansl22, or the Jeffrey-type equation, might be applicable in extreme cases, e.g., in a nanotube at very small timescales, on the order of femtoseconds. The solution describes a wavelike characteristic that is a combination of a damped wave and a weak diffusion process, which enables an instantaneous response, intrinsic to all heat conduction processes, as justified by statistical mechanics.
As mentioned earlier, the wavelike behavior has been demonstrated recently in SWNTs, via molecular dynamics simulation, although a lot of work needs to be done to extend the simulation to multilayer structures. Let us emphasize again that Fourier’s heat diffusion appears to be universal for heat conduction, and the hyperbolic heat equation, Eq. (7.5), can be neither physically justified nor practically useful. In the extreme acoustically thin limit, we are dealing with quantum conductance or the Schrödinger wave equation. This wave phenomenon cannot be explained by the hyperbolic heat equation.

7.3 HEAT CONDUCTION REGIMES

There has been a continuous effort to delineate the regimes of microscale heat conduction since 1992. A number of references have already been cited in Chap. 5. A recent effort has been made by Escobar et al.43 Following the previous discussions in this chapter, let us schematically depict the regimes of heat conduction, especially by electrons and phonons in crystalline solids, as in Fig. 7.8. Here, $\tau_c$ is known as the effective collision interaction time, or simply the collision time, since a collision does not occur instantaneously but proceeds through intermolecular potential and force interactions. These forces become important only when the particles come very close to each other. Of course, this is the classical picture of atomic or molecular interactions. Electrons and phonons are quantum mechanical particles; thus, the interaction is via the wavefunctions predicted by Schrödinger’s equation. For ultrafast pulse heating, the collision time can be the time required for a photon and an electron to interact. Generally speaking, the collision time is much shorter than the relaxation time and is neglected in the BTE. The characteristic phonon or electron wavelength $\lambda$ is assumed to be less than the mean free path $\Lambda$. Region 1 is the macroscale regime where Fourier’s law and the heat diffusion equation can be applied, when the timescale is greater than $\tau$ and the length scale is greater than about $10\Lambda$. Region 2 is called the mesoscale or quasi-equilibrium regime, which is characterized by the classical size effect. This region is also known as the first microscale. For


FIGURE 7.8 Heat conduction regimes.

heat transfer along a film or a wire, the local-equilibrium assumption is appropriate and boundary scattering reduces the effective mean free path and thermal conductivity. For heat transfer across a film or a multilayer, it is possible to use Fourier’s law inside the medium by considering an effective temperature and the temperature-jump boundary condition. It is difficult, if not impossible, to apply Fourier’s law to complex geometries or local heating. The two-temperature model for fast laser heating can be in either region 1 or 2, depending on how the length scale compares with the mean free path. Most of the research on microscale heat transfer between 1990 and 2005 dealt with the microscale phenomena in region 2. Region 3 is the regime of wave behavior, which is described by Schrödinger’s wave equation and where quantum tunneling can occur. The quantum size effect becomes significant for thermal conductivity and specific heat. Quantum conductance is a special case of quantum tunneling, for which the ballistic processes are confined in one dimension through a channel. For very thin layers, wave interference may become important. However, due to the interface roughness, the coherence may be destroyed so that the energy ray method or the particle approach can still be applied at very small length scales. We will give a comprehensive treatment of electromagnetic wave interference and scattering phenomena in subsequent chapters, without discussing the nature of acoustic waves further. The region on the upper left is said to be of no interest at short timescales because a thermal disturbance cannot travel that far and affect the temperature field. Region 4 is designed to represent the wavelike behavior, described by the Jeffreys-type equation, Eq. (7.14). When we say Jeffreys-type equation, we mean that both $\kappa_0$ and $\kappa_1$ in Eq. (7.12a) are positive. 
As discussed earlier, $\tau_N$ is the second relaxation time for phonon scattering that does not transfer or dissipate thermal energy, as in the N processes. In this regime, the BTE based on the two-relaxation-time approximation may be applied.21,22 This regime includes the heat pulse propagation and the second sound in dielectric crystals, at


low temperatures. It suffices to say that this region, while of great academic interest, has very limited applications. The pure hyperbolic heat equation, however, predicts a nonphysical wavefront and cannot be applied without the additional diffusion term. Nevertheless, theoretical studies of the hyperbolic heat equation have helped in a better understanding of the heat transfer behavior at short timescales and, subsequently, facilitated the development of more realistic models. While the lagging heat equation can mathematically describe both wavelike behavior and parallel heat conduction, it does not provide any new physics. On the other hand, the memory concept may be related to the anharmonic and nonlinear effects that are inherent to the solid and crystal structures. Study of the thermomechanical and thermoelastic effects, and thermal transport in polymers and inhomogeneous materials, such as biological materials, may require empirical and semiempirical models. The lagging heat equation or similar differential equations may be quite helpful in these applications. Region 5 belongs to the nanoscale regime, where it is necessary to employ quantum or sometimes classical molecular dynamics to study the underlying phenomena. The dashed ellipse indicates the overlapping between different regions, where molecular dynamics simulation may provide rich information as well as a bridge between different timescales and length scales. Holland (Phys. Rev., 132, 2461, 1963) analyzed the effect of different polarizations on the thermal conductivity of germanium. Several studies have employed the Monte Carlo method to solve phonon transport equations.36,37,49 The lattice Boltzmann method has also been employed in a number of publications.14,43 Molecular dynamics has been applied to the study of TBR, including the interface between SWNTs.50–53 Chung et al. investigated the effect of different dispersion models on the lattice thermal conductivity.54 Narumanchi et al. 
used a finite volume method to solve the 2-D BTE during transient heat transport for a local heating source in silicon.55 They also demonstrated the feasibility of including phonon dispersion and polarization in the model. Nonequilibrium phonon transport in dimensions less than 100 nm has become an important issue in silicon-on-insulator transistors. Several studies have focused on a multiscale approach to model the thermal transport phenomena at the device level.42,43,56–58 Sinha and Goodson provided an extensive review on multiscale modeling.42 Thermal metrology includes measurements of temperature (thermometry), specific heat (calorimetry), and heat flux. Thermophysical properties, such as thermal conductivity and specific heat, can be measured with steady-state, modulated, or pulsed heating techniques. MEMS and NEMS have enabled the fabrication of miniaturized heaters and sensors. Furthermore, optical techniques such as thermoreflectance, Raman spectroscopy, photothermal radiometry, fluorescence, and laser flash techniques have been widely used in the measurement of thermal properties of nano/microstructured materials. Scanning thermal microscopy and near-field optical microscopy have further improved the spatial resolution. A large number of publications can be found from the bibliography of the present and previous chapters [see, e.g., Cahill et al.31 and references therein]. Recently, Abel et al. employed micro-Raman spectroscopy to measure the temperature distribution in silicon microstructures with a spatial resolution of 1 μm.59 Lee et al. performed a steady-state characterization of heated AFM cantilevers over a range of pressures for thermal metrology applications.60 Park et al. analyzed the frequency response of heated AFM cantilevers in the frequency range from 10 Hz to 1 MHz, and observed high-order harmonic responses, such as $3\omega$, $5\omega$, and $7\omega$, at frequencies below 100 kHz and impedance effect at higher frequencies.61 Park et al. 
also investigated thermal behavior of heated cantilevers at cryogenic temperatures, down to 78 K.62 By measuring the thermal response at various frequencies, this study extracted the specific heat near the cantilever tip and the thermal conductivity along the heavily doped silicon legs, at temperatures ranging from 80 to 200 K. There appears to be a significant reduction in the thermal conductivity for the free-standing silicon cantilever, with a thickness of 0.59 μm, at low temperatures. These studies demonstrate that heated AFM cantilevers have become a promising thermal analysis tool at the micro- and nanoscales.59–62
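The Region 1 criterion given earlier in this section (timescale longer than the relaxation time, length scale longer than roughly ten mean free paths) can be written as a one-line check. The following is only a minimal sketch: the function name `in_fourier_regime`, the adjustable factor `length_factor`, and the sample relaxation time and mean free path are illustrative assumptions, not values read from Fig. 7.8.

```python
def in_fourier_regime(t, length, tau, mfp, length_factor=10.0):
    """Return True if (t, length) satisfy the Region 1 (macroscale) criterion.

    t, tau      : observation timescale and relaxation time, in seconds
    length, mfp : characteristic length and mean free path, in meters
    length_factor is the "about 10" multiple of the mean free path; it is an
    order-of-magnitude guide, not a sharp boundary.
    """
    return t > tau and length > length_factor * mfp


# Hypothetical carrier properties, chosen only to exercise the function.
TAU = 1.0e-11  # relaxation time, s (assumed)
MFP = 3.0e-7   # mean free path, m (assumed)

if __name__ == "__main__":
    cases = [
        (1e-6, 1e-3),   # slow and thick
        (1e-6, 1e-6),   # slow but thinner than 10 mean free paths
        (1e-12, 1e-3),  # thick but faster than the relaxation time
    ]
    for t, length in cases:
        print(t, length, in_fourier_regime(t, length, TAU, MFP))
```

Only the first case passes both tests. The other two fall outside Region 1, where the size-effect or short-timescale models discussed in this section would be needed instead.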


7.4 SUMMARY

The present chapter, along with Chaps. 5 and 6, provided a comprehensive treatment of the thermal properties of, and the transport processes in, micro/nanostructured solid materials. This chapter focused on transient and nonequilibrium heat conduction, when the local-equilibrium condition is not satisfied and the conventional heat diffusion theory, based on Fourier’s law, cannot be justified. Several modified phenomenological theories were critically reviewed with an emphasis on their application regimes. The phonon BTE was presented using the EPRT, and the solutions were discussed for the nonequilibrium heat transfer across a thin film or a multilayer structure. The basic models of TBR were outlined. Finally, a heat transfer regime map was developed to assist readers in choosing an appropriate methodology for a given situation, with a brief summary of advanced multiscale modeling and measurement techniques.

REFERENCES

1. H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, 2nd ed., Clarendon Press, Oxford, 1959.
2. M. N. Özişik, Heat Conduction, 2nd ed., Wiley, New York, 1993.
3. D. D. Joseph and L. Preziosi, “Heat waves,” Rev. Mod. Phys., 61, 41–73, 1989; D. D. Joseph and L. Preziosi, “Addendum to the paper ‘heat waves’,” Rev. Mod. Phys., 62, 375–391, 1990.
4. M. N. Özişik and D. Y. Tzou, “On the wave theory in heat conduction,” J. Heat Transfer, 116, 526–535, 1994.
5. W. K. Yeung and T. T. Lam, “A numerical scheme for non-Fourier heat conduction, Part I: One-dimensional problem formulation and applications,” Numer. Heat Transfer B, 33, 215–233, 1998.
6. A. Haji-Sheikh, W. J. Minkowycz, and E. M. Sparrow, “Certain anomalies in the analysis of hyperbolic heat conduction,” J. Heat Transfer, 124, 307–319, 2002.
7. J. Gembarovic and J. Gembarovic, Jr., “Non-Fourier heat conduction modeling in a finite medium,” Int. J. Thermophys., 25, 1261–1268, 2004.
8. M. B. Rubin, “Hyperbolic heat conduction and the second law,” Int. J. Eng. Sci., 30, 1665–1676, 1992.
9. C. Bai and A. S. Lavine, “On hyperbolic heat conduction and the second law of thermodynamics,” J. Heat Transfer, 117, 256–263, 1995.
10. A. Barletta and E. Zanchini, “Hyperbolic heat conduction and local equilibrium: A second law analysis,” Int. J. Heat Mass Transfer, 40, 1007–1016, 1997.
11. D. Jou, J. Casas-Vazquez, and G. Lebon, Extended Irreversible Thermodynamics, 2nd ed., Springer, Berlin, 1996.
12. A. Majumdar, “Microscale heat conduction in dielectric thin films,” J. Heat Transfer, 115, 7–16, 1993.
13. S. Volz, J.-B. Saulnier, M. Lallemand, B. Perrin, P. Depondt, and M. Mareschal, “Transient Fourier-law deviation by molecular dynamics in solid argon,” Phys. Rev. B, 54, 340–347, 1996.
14. J. Xu and X. Wang, “Simulation of ballistic and non-Fourier thermal transport in ultra-fast laser heating,” Physica B, 351, 213–226, 2004.
15. D. Y. Tzou, Macro- to Microscale Heat Transfer: The Lagging Behavior, Taylor & Francis, Washington DC, 1997.
16. P. J. Antaki, “Solution for non-Fourier dual phase lag heat conduction in a semi-infinite slab with surface heat flux,” Int. J. Heat Mass Transfer, 41, 2253–2258, 1998.
17. D. W. Tang and N. Araki, “Wavy, wavelike, diffusive thermal responses of finite rigid slabs to high-speed heating of laser-pulses,” Int. J. Heat Mass Transfer, 42, 855–860, 1999.
18. D. Y. Tzou and K. S. Chiu, “Temperature-dependent thermal lagging in ultrafast laser heating,” Int. J. Heat Mass Transfer, 44, 1725–1734, 2001.
19. W. J. Minkowycz, A. Haji-Sheikh, and K. Vafai, “On departure from local thermal equilibrium in porous media due to a rapid changing heat source: The Sparrow number,” Int. J. Heat Mass Transfer, 42, 3373–3385, 1999.
20. W. Kaminski, “Hyperbolic heat conduction equation for materials with a nonhomogeneous inner structure,” J. Heat Transfer, 112, 555–560, 1990.
21. J. Callaway, “Model for lattice thermal conductivity at low temperatures,” Phys. Rev., 113, 1046–1051, 1959.
22. R. A. Guyer and J. A. Krumhansl, “Solution of the linearized phonon Boltzmann equation,” Phys. Rev., 148, 766–778, 1966; “Thermal conductivity, second sound, and phonon hydrodynamic phenomena in nonmetallic crystals,” Phys. Rev., 148, 778–788, 1966.
23. J. Shiomi and S. Maruyama, “Non-Fourier heat conduction in a single-walled carbon nanotube: Classical molecular dynamics simulations,” Phys. Rev. B, 73, 205420, 2006.
24. D. H. Tsai and R. A. MacDonald, “Molecular-dynamics study of second sound in a solid excited by a strong heat pulse,” Phys. Rev. B, 14, 4714–4723, 1976.
25. X. Wang and X. Xu, “Thermoelastic wave induced by pulsed laser heating,” Appl. Phys. A, 73, 107–114, 2001; X. Wang, “Thermal and thermomechanical phenomena in picosecond laser copper interaction,” J. Heat Transfer, 126, 355–364, 2004.
26. S. I. Anisimov, B. L. Kapeliovich, and T. L. Perel’man, “Electron emission from metal surfaces exposed to ultrashort laser pulses,” Sov. Phys. JETP, 39, 375–377, 1974.
27. T. Q. Qiu and C. L. Tien, “Short-pulse laser heating on metals,” Int. J. Heat Mass Transfer, 35, 719–726, 1992; T. Q. Qiu and C. L. Tien, “Size effect on nonequilibrium laser heating of metal films,” J. Heat Transfer, 115, 842–847, 1993; T. Q. Qiu, T. Juhasz, C. Suarez, W. E. Bron, and C. L. Tien, “Femtosecond laser heating of multi-layer metals—II. Experiments,” Int. J. Heat Mass Transfer, 37, 2799–2808, 1994.
28. J. L. Hostetler, A. N. Smith, D. M. Czajkowsky, and P. M. Norris, “Measurement of the electron-phonon coupling factor dependence on film thickness and grain size in Au, Cr, and Al,” Appl. Opt., 38, 3614–3620, 1999.
29. S. Link, C. Burda, Z. L. Wang, and M. A. El-Sayed, “Electron dynamics in gold and gold-silver alloy nanoparticles: The influence of a nonequilibrium electron distribution and the size dependence of the electron-phonon relaxation,” J. Chem. Phys., 111, 1255–1264, 1999.
30. A. N. Smith and P. M. Norris, “Influence of intraband transition on the electron thermoreflectance response of metals,” Appl. Phys. Lett., 78, 1240–1242, 2001; R. J. Stevens, A. N. Smith, and P. M. Norris, “Measurement of thermal boundary conductance of a series of metal-dielectric interfaces by the transient thermoreflectance techniques,” J. Heat Transfer, 127, 315–322, 2005.
31. D. G. Cahill, K. Goodson, and A. Majumdar, “Thermometry and thermal transport in micro/nanoscale solid-state devices and structures,” J. Heat Transfer, 124, 223–241, 2002; D. G. Cahill, W. K. Ford, K. Goodson, et al., “Nanoscale thermal transport,” J. Appl. Phys., 93, 793–818, 2003.
32. D. M. Riffe, X. Y. Wang, M. C. Downer, et al., “Femtosecond thermionic emission from metals in the space-charge-limited regime,” J. Opt. Soc. Am. B, 10, 1424–1435, 1993.
33. A. N. Smith, J. L. Hostetler, and P. M. Norris, “Nonequilibrium heating in metal films: An analytical and numerical analysis,” Numer. Heat Transfer A, 35, 859–874, 1999.
34. D. Y. Tzou and K. S. Chiu, “Temperature-dependent thermal lagging in ultrafast laser heating,” Int. J. Heat Mass Transfer, 44, 1725–1734, 2001.
35. L. Jiang and H.-L. Tsai, “Energy transport and nanostructuring of dielectrics by femtosecond laser pulse trains,” J. Heat Transfer, 128, 926–933, 2006.
36. T. Klitsner, J. E. VanCleve, H. E. Fischer, and R. O. Pohl, “Phonon radiative heat transfer and surface scattering,” Phys. Rev. B, 38, 7576–7594, 1988.
37. R. B. Peterson, “Direct simulation of phonon-mediated heat transfer in a Debye crystal,” J. Heat Transfer, 116, 815–822, 1994.
38. E. T. Swartz and R. O. Pohl, “Thermal boundary resistance,” Rev. Mod. Phys., 61, 605–668, 1989.
39. W. A. Little, “The transport of heat between dissimilar solids at low temperatures,” Can. J. Phys., 37, 334–349, 1959.
40. G. Chen and C. L. Tien, “Thermal conductivity of quantum well structures,” J. Thermophys. Heat Transfer, 7, 311–318, 1993; G. Chen, “Size and interface effects on thermal conductivity of superlattices and periodic thin-film structures,” J. Heat Transfer, 119, 220–229, 1997; G. Chen, “Thermal conductivity and ballistic-phonon transport in the cross-plane direction of superlattices,” Phys. Rev. B, 57, 14958–14973, 1998.
41. G. Chen and T. Zeng, “Nonequilibrium phonon and electron transport in heterostructures and superlattices,” Microscale Thermophys. Eng., 5, 71–88, 2001; T. Zeng and G. Chen, “Phonon heat conduction in thin films: Impacts of thermal boundary resistance and internal heat generation,” J. Heat Transfer, 123, 340–347, 2001.
42. S. Sinha and K. E. Goodson, “Review: Multiscale thermal modeling in nanoelectronics,” Int. J. Multiscale Comp. Eng., 3, 107–133, 2005.
43. R. A. Escobar, S. S. Ghai, M. S. Jhon, and C. H. Amon, “Multi-length and time scale thermal transport using the lattice Boltzmann method with application to electronics cooling,” Int. J. Heat Mass Transfer, 49, 97–107, 2006.
44. E. M. Sparrow and R. D. Cess, Radiation Heat Transfer, Augmented ed., McGraw-Hill, New York, 1978.
45. M. F. Modest, Radiative Heat Transfer, McGraw-Hill, New York, 1993.
46. M. M. Yovanovich, “Four decades of research on thermal contact, gap, and joint resistance in microelectronics,” IEEE Trans. Compon. Packag. Technol., 28, 182–206, 2005.
47. P. E. Phelan, “Application of diffuse mismatch theory to the prediction of thermal boundary resistance in thin-film high-Tc superconductors,” J. Heat Transfer, 120, 37–43, 1998; L. De Bellis, P. E. Phelan, and R. S. Prasher, “Variations of acoustic and diffuse mismatch models in predicting thermal-boundary resistance,” J. Thermophys. Heat Transfer, 14, 144–150, 2000.
48. H. J. Lee, Private communication, which provided a detailed derivation of the correct expression of the specularity $p$. A sequence of typos was found in Ziman’s book, Ref. [23] in Chap. 5, leading to the erroneous expression $p = \exp(-16\pi^3\sigma_{\mathrm{rms}}^2/\lambda^2)$.
49. S. Mazumdar and A. Majumdar, “Monte Carlo study of phonon transport in solid thin films including dispersion and polarization,” J. Heat Transfer, 123, 749–759, 2001.
50. C.-J. Twu and J.-R. Ho, “Molecular-dynamics study of energy flow and the Kapitza conductance across an interface with imperfection formed by two dielectric thin films,” Phys. Rev. B, 67, 205422, 2003.
51. S. R. Phillpot, P. K. Schelling, and P. Keblinski, “Interfacial thermal conductivity: Insights from atomic level simulation,” J. Mater. Sci., 40, 3143–3148, 2005.
52. Y. Chen, D. Li, J. R. Lukes, Z. Ni, and M. Chen, “Minimum superlattice thermal resistivity from molecular dynamics,” Phys. Rev. B, 72, 174302, 2005.
53. H. Zhong and J. R. Lukes, “Interfacial thermal resistance between carbon nanotubes: Molecular dynamics simulations and analytical thermal modeling,” Phys. Rev. B, 74, 125403, 2006.
54. J. D. Chung, A. J. H. McGaughey, and M. Kaviany, “Role of phonon dispersion in lattice thermal conductivity,” J. Heat Transfer, 126, 376–380, 2004.
55. S. V. J. Narumanchi, J. Y. Murthy, and C. H. Amon, “Simulation of unsteady small heat source effects in sub-micron heat conduction,” J. Heat Transfer, 125, 896–903, 2003; S. V. J. Narumanchi, J. Y. Murthy, and C. H. Amon, “Submicron heat transport model in silicon accounting for phonon dispersion and polarization,” J. Heat Transfer, 126, 946–955, 2004.
56. J. Lai and A. Majumdar, “Concurrent thermal and electrical modeling of sub-micrometer silicon devices,” J. Appl. Phys., 79, 7353–7361, 1996.
57. P. G. Sverdrup, Y. S. Ju, and K. E. Goodson, “Sub-continuum simulation of heat conduction in silicon-on-insulator transistors,” J. Heat Transfer, 123, 130–137, 2001.
58. S. Sinha, E. Pop, R. W. Dutton, and K. E. Goodson, “Non-equilibrium phonon distribution in sub-100 nm silicon transistors,” J. Heat Transfer, 128, 638–647, 2006.
59. M. R. Abel, T. L. Wright, W. P. King, and S. Graham, “Thermal metrology of silicon microstructures using Raman spectroscopy,” IEEE Trans. Comp. Pack. Technol., accepted 2007.
60. J. Lee, T. Beechem, T. L. Wright, B. A. Nelson, S. Graham, and W. P. King, “Electrical, thermal, and mechanical characterization of silicon microcantilever heaters,” J. Microelectromech. Syst., 15, 1644, 2007; J. Lee, T. L. Wright, M. R. Abel, et al., “Thermal conduction from microcantilever heaters in partial vacuum,” J. Appl. Phys., 101, 014906, 2007.
61. K. Park, J. Lee, Z. M. Zhang, and W. P. King, “Frequency-dependent electrical and thermal response of heated atomic force microscope cantilevers,” J. Microelectromech. Syst., accepted 2007.
62. K. Park, A. Marchenkov, Z. M. Zhang, and W. P. King, “Low temperature characterization of heated microcantilevers,” J. Appl. Phys., accepted 2007.

PROBLEMS

7.1. What is the characteristic length for heat conduction along a thin film? Why is local equilibrium a good assumption in this case, even though the film thickness is less than the mean free path of the heat carriers? Why does the thermal conductivity depend on the thickness of the film?

7.2. Why do we say that Fourier’s law is a fundamental physical law, like Newton’s laws in mechanics, but the Cattaneo equation is not? Comment on the paradox of infinite speed of heat diffusion by considering the feasibility of exciting the surface temperature or depositing a heat flux to the surface instantaneously.

7.3. Consider a 1-D semi-infinite medium initially at uniform temperature $T_i$. The surface temperature is suddenly changed to a constant temperature, $T(0,t) = T_s$. The analytical solution of the heat diffusion equation gives

$$\theta(x,t) = \frac{T(x,t) - T_i}{T_s - T_i} = \mathrm{erfc}\left(\frac{x}{2\sqrt{\alpha t}}\right)$$

For silicon at various temperatures, use the properties given in Example 5-6 to estimate how long it will take for a given location to gain a temperature rise that is $\theta = 10^{-12}$, or one part per trillion of the maximum temperature difference. Estimate the average thermal diffusion speed in terms of $x$ and $T_i$. [Hint: $\mathrm{erfc}(5.042) = 1.00 \times 10^{-12}$.]

7.4. Repeat Problem 7.3, using copper instead of silicon as the material, based on the properties given in Example 5-5. Discuss why the average thermal diffusion speed is different under different boundary conditions, i.e., constant heat flux and constant temperature. From an engineering point of view, do you think heat diffusion is a fast or slow process? Why?

7.5. (a) Derive Eq. (7.4), the hyperbolic heat equation, from the Cattaneo equation. (b) Derive Eq. (7.14), the lagging heat equation, based on the dual-phase-lag model.

7.6. Take GaAs as an example. How would you compare the speed of sound with the average thermal diffusion speed at different temperatures and length scales? This problem requires some literature search on the properties.

7.7. Assume the hyperbolic heat equation would work for transient heat transfer in glass (Pyrex), at near room temperature. Given $\kappa = 1.4$ W/(m·K), $\rho = 2500$ kg/m³, $c_p = 835$ J/(kg·K), and $v_a = 5640$ m/s. (a) At what speed would the temperature wave propagate? (b) For an excimer laser with a pulse width $t_p = 10$ ns, 0.1 ns after the pulse starts, could the hyperbolic equation be approximated by the parabolic equation? (c) Suppose we have an instrument available to probe the timescale below $\tau_q$; will the hyperbolic heat equation be able to describe the observation?

7.8. Derive Eq. (7.13b) from Eq. (7.13a). Discuss the conditions for these equations to be reduced to Fourier’s law or the Cattaneo equation.

7.9. Show that Eq. (7.17) satisfies Eq. (7.16). Discuss the conditions for Eq. (7.17) to represent Fourier’s law or the Cattaneo equation.

7.10. Derive Eq. (7.18a), Eq. (7.18b), and Eq. (7.18c).

7.11. Derive Eq. (7.27a) and Eq. (7.27b). Calculate $\tau$, $\tau_q$, and $\tau_T$ of copper, for $T_e = 300$, 1000, and 5000 K, assuming the lattice temperature $T_s = 300$ K.

7.12. Calculate the electron-phonon coupling constant $G$ for aluminum, copper, gold, and silver near room temperature. Discuss the dependence of $\kappa$ and $G$ upon the electron and lattice temperatures $T_e$ and $T_s$.

7.13. At $T_e = 1000$, 3000, and 6000 K, estimate the energy transfer by thermionic emission from a copper surface, assuming that the electrons obey the equilibrium distribution function at $T_e$.

7.14. Based on Example 7-3, evaluate the heat flux in a thin silicon film. How thin must it be in order for it to be considered as in the radiative thin limit? Calculate the medium temperature T. Plot the left-hand side and the right-hand side of Eq. (7.43). Furthermore, assuming Eq. (7.43) to be true for each frequency, find a frequency-dependent temperature $T(\omega)$ of the medium. At what frequency does $T(\omega) = T$? Is there any physical significance of $T(\omega)$?

7.15. Derive Eq. (7.53), using Eq. (7.38), Eq. (7.49), and Eq. (7.50).

7.16. In principle, one should be able to study nonequilibrium electrical and thermal conduction in the direction perpendicular to the plane, and use the BTE to determine the effective conductivities. This could be a team project, in which a few students work together to formulate the necessary equations. As an individual assignment, describe how to set up the boundary conditions, as well as the steps you plan to follow, without actually deriving the equations.

7.17. For a diamond type IIa film, $v_l = 17{,}500$ m/s, $v_t = 12{,}800$ m/s, and $\kappa = 3300$ W/(m·K), near 300 K. Assume that the boundaries can be modeled as blackbodies for phonons. For boundary temperatures $T_1 = 350$ K and $T_2 = 250$ K, calculate and plot the heat flux $q''_x$ and the effective thermal conductivity $\kappa_{\mathrm{eff}}$ across a film of thickness $L$, which varies from 0.05 to 50 μm.

7.18. Calculate the TBR between the high-temperature superconductor YBa$_2$Cu$_3$O$_{7-\delta}$ and an MgO substrate, at an average temperature between 10 and 90 K, using both the AMM and the DMM without considering the electronic effect. The following parameters are given for YBa$_2$Cu$_3$O$_{7-\delta}$: $v_l = 4780$ m/s, $v_t = 3010$ m/s, $\rho = 6338$ kg/m³, and $\Theta_D = 450$ K; and for MgO: $v_l = 9710$ m/s, $v_t = 6050$ m/s, $\rho = 3576$ kg/m³, and $\Theta_D = 950$ K.

7.19. Evaluate the effective thermal conductivity near room temperature of a GaAs/AlAs superlattice, with a total thickness of 800 nm, using the DMM to compute the transmission coefficient. Assume the end surfaces are blackbodies to phonons; consider that (a) each layer is 4 nm thick and (b) each layer is 40 nm thick. The following parameters are given, considering phonon dispersion on thermal conductivity, for GaAs: $C = 880$ kJ/(m³·K), $v_g = 1024$ m/s, and $\Lambda = 145$ nm; and for AlAs: $C = 880$ kJ/(m³·K), $v_g = 1246$ m/s, and $\Lambda = 236$ nm. How does the result compare with a single layer of either GaAs or AlAs?

7.20. Evaluate the effective thermal conductivity near room temperature of a Si/Ge superlattice, with a total thickness of 1000 nm, using the DMM to compute the transmission coefficient. Assume the end surfaces are blackbodies to phonons; consider that (a) each layer is 5 nm thick and (b) each layer is 50 nm thick. The following parameters are given, considering phonon dispersion on thermal conductivity, for Si: $C = 930$ kJ/(m³·K), $v_g = 1804$ m/s, and $\Lambda = 260$ nm; and for Ge: $C = 870$ kJ/(m³·K), $v_g = 1042$ m/s, and $\Lambda = 199$ nm. How does the result compare with a single layer of either Si or Ge?
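As a starting point for Problems 7.3 and 7.4, the hint can be checked numerically and turned into an arrival-time estimate. This is a minimal sketch, not a worked solution: setting the argument of erfc equal to 5.042 gives $t = x^2/(4 \times 5.042^2\,\alpha)$, so the average speed $x/t$ scales as $\alpha/x$. The function names and the diffusivity `ALPHA_ASSUMED` are illustrative assumptions; the property values of Examples 5-5 and 5-6 should be used in an actual solution.

```python
import math

ETA = 5.042  # similarity variable from the hint: erfc(5.042) is about 1.00e-12


def arrival_time(x, alpha, eta=ETA):
    """Time (s) at which theta = erfc(eta) is first reached at depth x (m).

    Follows from x / (2 * sqrt(alpha * t)) = eta for the semi-infinite medium
    with a suddenly imposed surface temperature.
    """
    return x**2 / (4.0 * eta**2 * alpha)


def average_speed(x, alpha, eta=ETA):
    """Average diffusion speed x / t (m/s) for the same threshold."""
    return x / arrival_time(x, alpha, eta)


# Assumed thermal diffusivity, of the order expected for a good crystalline
# conductor near room temperature; replace with the tabulated value.
ALPHA_ASSUMED = 9.0e-5  # m^2/s (assumption)

if __name__ == "__main__":
    print("erfc(5.042) =", math.erfc(ETA))
    for x in (1e-6, 1e-3, 1.0):
        t = arrival_time(x, ALPHA_ASSUMED)
        print(f"x = {x:g} m: t = {t:.3e} s, x/t = {average_speed(x, ALPHA_ASSUMED):.3e} m/s")
```

Because the average speed is inversely proportional to $x$, the result depends strongly on the depth considered, which is worth keeping in mind when answering whether diffusion is "fast" or "slow."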

CHAPTER 8

FUNDAMENTALS OF THERMAL RADIATION

Radiation is one of the fundamental modes of heat transfer. However, the concepts of thermal radiation are much more complicated and, hence, very difficult to perceive. The main features of radiation that are distinct from conduction and convection are as follows: (a) Radiation can transfer energy with and without an intervening medium; (b) The radiant heat flux is not proportional to the temperature gradient; (c) Radiation emission is wavelength dependent, and the radiative properties of materials depend on the wavelength and the temperature; and (d) The radiant energy exchange and the radiative properties depend on the direction and orientation.1,2 The dual theory explains the nature of radiation as either electromagnetic waves or a collection of particles called photons. Although radiation can travel in vacuum, it originates from matter. All forms of matter emit radiation through complicated mechanisms (e.g., molecular vibration in gases, and electron and lattice vibrations in solids). In most solids and some liquids, radiation emitted from the interior is strongly absorbed by adjoining molecules. Therefore, radiation from or to these materials is often treated as surface phenomena, while radiation in gases and some semitransparent solids or liquids has to be treated as volumetric phenomena. Nevertheless, one must treat solids or liquids as a medium (i.e., volumetrically) to understand the mechanisms of reflection and emission, to predict the radiative properties of thin films and small particles, and to calculate radiation heat transfer between objects placed in close vicinity. Thermal radiation refers to a type of radiation where the emission is directly related to the temperature of the body (or surface). There are numerous engineering applications where radiation heat transfer is important, such as furnaces, combustion, high-temperature materials processing and manufacturing, solar energy, space cooling and insulation, and cryogenic systems. 
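Feature (b) can be put in numbers with a short sketch. It assumes two large, parallel, black surfaces exchanging radiation according to the Stefan-Boltzmann law, so the net flux is $\sigma(T_1^4 - T_2^4)$; the function names and sample temperatures are illustrative assumptions. The linearized coefficient $4\sigma T_m^3$ is a common small-difference approximation, used here only for comparison.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def blackbody_net_flux(t_hot, t_cold):
    """Net radiative flux (W/m^2) between two large parallel black surfaces."""
    return SIGMA * (t_hot**4 - t_cold**4)


def linearized_coefficient(t_mean):
    """Linearized radiative heat transfer coefficient 4*sigma*T^3, W/(m^2 K)."""
    return 4.0 * SIGMA * t_mean**3


if __name__ == "__main__":
    # The same 20 K temperature difference at two different mean temperatures.
    for t_hot, t_cold in [(310.0, 290.0), (1010.0, 990.0)]:
        t_mean = 0.5 * (t_hot + t_cold)
        q = blackbody_net_flux(t_hot, t_cold)
        h = linearized_coefficient(t_mean)
        print(f"T_mean = {t_mean:.0f} K: q = {q:.1f} W/m^2, 4*sigma*T^3 = {h:.2f} W/(m^2 K)")
```

The same temperature difference gives a much larger flux at the higher mean temperature, so the flux is set by the absolute temperatures rather than by their difference alone. Near 300 K the linearized coefficient comes out at about 6 W/(m²·K).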
Even at room temperature, radiative heat transfer may be of the same order of magnitude as convective heat transfer. The study of thermal radiation went along with the study of light phenomena and led to some major breakthroughs in modern physics. It is instructive to give a brief survey of major historical developments related to thermal radiation. Quantitative understanding of the nature of light began in the seventeenth century with the discoveries of Snell’s law of refraction, Fermat’s least-time principle of light path, Huygens’ principle of constructing the wavefront from secondary waves, and Newton’s prism that helped him prove white light consists of many different types of rays. At the dawn of the nineteenth century, Sir Frederick Herschel (1738–1822), a German-born English astronomer, discovered infrared radiation.3 His original objective was to find a suitable color for a glass filter, which could transmit most of the light but the least amount of heat, for use in solar observations. By moving a thermometer along the spectrum of solar radiation that passed through a prism, Herschel accidentally found that the temperature of the thermometer would rise even though it was placed beyond the red end of the visible light. He published several papers in Philosophical Transactions of the Royal Society of London in


1800 and called the unknown radiation invisible light or heat-making rays. Young’s double-slit experiment in 1801 demonstrated the interference phenomenon and the wave nature of light, followed by intensive studies on polarization and reflection phenomena, led by French physicist Augustin-Jean Fresnel (1788–1827), who contributed significantly to the establishment of the wave theory of light. In 1803, radiation beyond the violet end of the visible spectrum was also discovered via chemical effects. The ultraviolet, visible, and infrared spectra were thus associated with chemical, luminous, and heating effects, respectively. Yet, the common nature of the different types of radiation was not known until the late nineteenth century. One of the obstacles to accurately measuring infrared radiation (or heat radiation, as it was called in those days) was the lack of sensitive detectors. In the earlier years, measurements were performed using thermometers with blackened bulbs. In 1829, Italian physicists Leopoldo Nobili (1784–1835) and Macedonio Melloni (1798–1854) invented the thermopile, which is made by connecting a number of thermocouples in series and is much more sensitive and faster than the thermometer. Melloni used the device to study the infrared radiation from hot objects and the sun. Gustav Kirchhoff (1824–1887), a German physicist, contributed greatly to the fundamental understanding of spectroscopy and the thermal emission by heated objects. In 1862, he coined the term “black body” radiation and established Kirchhoff’s law, which states that the emissivity of a surface equals its absorptivity at thermal equilibrium. Many famous physicists and mathematicians have contributed to electromagnetism. The complete equations of electromagnetic waves were established in 1873 by Scottish physicist James Clerk Maxwell (1831–1879), and later confirmed experimentally by German physicist Heinrich Hertz (1857–1894), who discovered radio waves due to electrical vibrations. 
Before the existence of electrons had been proven, Dutch physicist Hendrik Lorentz (1853–1928) proposed that light waves were due to oscillations of an electric charge in the atom. He received the Nobel Prize in Physics in 1902 for his mathematical theory relating electron wave motion and light. The 1902 Nobel Prize was shared with his student Pieter Zeeman (1865–1943) for the experimental study of the effect of magnetic fields on atomic structures, which results in the splitting of spectral lines of the emitted light. The electromagnetic wave theory has played a central role in radio, radar, television, microwave technology, telecommunication, thermal radiation, and physical optics. Albert Einstein arrived at the famous formula $E = mc^2$ in 1905, after connecting the relativity principle with the Maxwell equations.

In 1881, Samuel Langley (1834–1906), the American astronomer, physicist, and aeronautics pioneer, invented a highly sensitive device called the bolometer for detection of thermal radiation. The bolometer used two platinum strips, connected in a Wheatstone bridge circuit with a sensitive galvanometer, to read the imbalance of the bridge caused by the exposure of one of the strips to radiation. Langley was the first to make an accurate map of the solar spectrum up to a wavelength of 2.8 µm. The Stefan-Boltzmann law of blackbody radiation is a result of the empirical relation obtained by Slovenian physicist Joseph Stefan (1835–1893) in 1879, based on observation of experiments, and the theoretical proof given by Austrian physicist Ludwig Boltzmann (1844–1906) in 1884, based on thermodynamic relations of a Carnot cycle with radiation as a working fluid using the concept of radiation pressure. In the late nineteenth century, German physicist Wilhelm Wien (1864–1928) derived the displacement law in 1893 by considering a piston moving within a mirrored empty cylinder filled with thermal radiation.
Wien also derived a spectral distribution of blackbody radiation, called Wien’s formula, which is applicable to the short-wavelength region of the blackbody spectrum but deviates toward long wavelengths. Wien received the Nobel Prize in 1911 “for his discoveries regarding the laws governing the radiation of heat.” In 1900, Lord Rayleigh (1842–1919), British physicist and Nobel Laureate in Physics in 1904, used the equipartition theorem to show that the blackbody emission should be directly proportional to temperature but inversely proportional to the fourth power of wavelength. Sir James Jeans (1877–1946), British physicist, astronomer, and mathematician,

FUNDAMENTALS OF THERMAL RADIATION


derived a more complete expression in 1905. The Rayleigh-Jeans formula agreed with experiments at sufficiently high temperatures and long wavelengths, where Wien’s formula failed, but disagreed with experiments at short wavelengths. It is noteworthy that Rayleigh made great contributions to light scattering and wave phenomena, such as the discovery of Rayleigh scattering by small objects, which explains why the sky is blue and why the sunset glows orange. Rayleigh also predicted the existence of surface waves, sometimes called Rayleigh waves, which propagate along the interface between two different media. The amplitude of the wave, however, decays in each medium as the distance from the interface increases.

In an effort to obtain a better agreement with measurements at long wavelengths, German physicist Max Planck (1858–1947) in 1900 used the maximum entropy principle, based on Boltzmann’s entropy expression, to derive an equation, known as Planck’s law, which agrees with experiments in the whole spectral region. Planck obtained his expression independently of Rayleigh’s work published several months earlier, while the complete derivation of the Rayleigh-Jeans formula was obtained several years later. In his book The Theory of Heat Radiation, Planck showed that his formula would reduce to Wien’s formula at small $\lambda T$ and to the Rayleigh-Jeans formula at very large $\lambda T$.4 In his derivation, Planck used a bold assumption that contradicts classical electrodynamics. His hypothesis was that energy is not infinitely divisible but must assume discrete values, which are proportional to the frequency. This concept would have been easily accepted for a system consisting of particles, like atoms or gas molecules, but not for oscillators that radiate electromagnetic energy. Planck’s work opened the door to quantum mechanics. The idea of quantization of radiation was further developed by Einstein, who applied it to explain the photoelectric effect in 1905.
Planck was awarded the Nobel Prize in Physics in 1918 for the discovery of energy quanta. In 1924, Indian mathematical physicist Satyendra Nath Bose (1894–1974) modified the Boltzmann statistics of ideal molecular gases by treating photons as indistinguishable particles in order to derive Planck’s distribution function. With the help of Einstein, Bose’s work was published in Zeitschrift für Physik in 1924. Einstein further extended Bose’s theory to atoms and predicted the existence of a phenomenon, known as Bose-Einstein condensate, as discussed in Chap. 3. Clearly, the quest to understand thermal radiation has led to important discoveries in modern physics.

This chapter contains an introduction to the electromagnetic wave theory, blackbody radiation, plane wave reflection and refraction at the boundary between two semi-infinite media, and various models used to study the optical properties of different materials. The material covered in the following sections is intended to provide a detailed background for more in-depth discussion on the applications to micro/nanosystems in subsequent chapters.

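8.1 ELECTROMAGNETIC WAVES

8.1.1 Maxwell’s Equations

The propagation of electromagnetic waves in any medium is governed by a set of equations, first stated together by Maxwell. The macroscopic Maxwell equations can be written in the differential forms as follows:5–7

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{8.1}$$

$$\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \tag{8.2}$$

$$\nabla \cdot \mathbf{D} = \rho_e \tag{8.3}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{8.4}$$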


Based on the SI units, $\mathbf{E}$ in V/m is the electric field, $\mathbf{H}$ in A/m is the magnetic field, $\mathbf{J}$ in A/m² is the electric current density (i.e., electric charge flux), $\mathbf{D}$ in C/m² is the electric displacement, $\mathbf{B}$ in Wb/m² is the magnetic flux density (also called magnetic induction), and $\rho_e$ in C/m³ is the charge density. Note that in magnetism, 1 tesla (T) = 1 Wb/m², and 1 weber (Wb) = 1 V·s. The charge conservation or continuity equation, $\nabla \cdot \mathbf{J} + \partial \rho_e/\partial t = 0$, is implicitly included in the Maxwell equations, because it can be obtained by taking the divergence of Eq. (8.2) and then applying Eq. (8.3). The constitutive relations for a linear isotropic medium are

$$\mathbf{D} = \varepsilon_m \mathbf{E} \tag{8.5}$$

$$\mathbf{B} = \mu_m \mathbf{H} \tag{8.6}$$

where $\varepsilon_m$ in F/m is the electric permittivity and $\mu_m$ in N/A² is the magnetic permeability of the medium. Note that farad (F) is the SI unit of capacitance: 1 F = 1 C/V. The permittivity and permeability values of free space (vacuum) are $\varepsilon_0 = 8.854 \times 10^{-12}$ F/m and $\mu_0 = 4\pi \times 10^{-7}$ N/A², respectively. For anisotropic media, $\varepsilon_m$ and $\mu_m$ are dyadic tensors. The microscopic form of Ohm’s law gives

$$\mathbf{J} = \sigma \mathbf{E} \tag{8.7}$$

where $\sigma$ in A/(V·m) is the electric conductivity.

A brief discussion on the physical interpretation of Maxwell’s equations is given next. Equation (8.1) is an expression of Faraday’s law of induction, which states that a time-varying magnetic field produces an electric field in a coil. In other words, through any closed electric field line, there is a time-varying magnetic field. Combining Eq. (8.1) with Green’s theorem, Eq. (B.71), we see that the integral of the electric field around a closed loop is equal to the negative of the integral of the time derivative of the magnetic induction, over the area enclosed by the loop. Equation (8.2) is the general Ampere law, which includes Maxwell’s displacement current ($\partial \mathbf{D}/\partial t$). It states that through any closed magnetic field line, there is an electric current density $\mathbf{J}$ or a displacement current or both. Conversely, circulating magnetic fields are produced by passing an electrical current through a conductor or changing electric fields or both. Equation (8.3) is Gauss’s law, which implies that the electric field diverges from electric charges. Using Gauss’s theorem, Eq. (B.70), it can be seen from Eq. (8.3) that the integral of the electric field over a closed surface is proportional to the electric charges enclosed by that surface. If there are no electric charges inside a closed surface, there is no net electric field penetrating the surface. Equation (8.4) is the analog of Gauss’s law for the magnetic field. However, since there exist no isolated magnetic poles, called magnetic monopoles, the integration of the magnetic field over any closed surface is zero.

The interpretations given in the preceding paragraph are straightforward since all variables and coefficients are considered as real quantities. However, Maxwell’s equations are most useful when all quantities are expressed in complex variables. The material properties, such as $\varepsilon_m$ and $\mu_m$, are generally complex and frequency dependent.
To facilitate the understanding, we will start with simple cases first and then generalize the theory for more realistic problems.

8.1.2 The Wave Equation

Sometimes called free charge density, $\rho_e$ in Eq. (8.3) should be treated as excess charges or net charges per unit volume. Because the number of electrons equals the number of protons in the nuclei, in most media, we can assume $\rho_e = 0$. For a nonconductive material, $\sigma = 0$. We further assume that $\varepsilon_m$ and $\mu_m$ are both real and independent of position, time, and the field strength. This is true for a nondissipative (lossless), homogeneous, and linear material. If $\mu_m = \mu_0$, the material is said to be nonmagnetic. Therefore, a nonconductive and


nonmagnetic material is a dielectric for which only $\varepsilon_m$ is needed to characterize its electromagnetic behavior. Materials with both $\varepsilon_m$ and $\mu_m$ being real but $\mu_m \neq \mu_0$ are sometimes called general dielectrics or dielectric-magnetic media. Substituting the constitutive relations into Maxwell’s equations and then combining Eq. (8.1) and Eq. (8.2), we obtain

$$\nabla^2 \mathbf{E} = \mu_m \varepsilon_m \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{8.8}$$

where the vector identity given in Eq. (B.64), $\nabla \times (\nabla \times \mathbf{E}) = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\nabla^2 \mathbf{E}$, has been employed. Equation (8.8) is the wave equation, which can also be written in terms of the magnetic field. The wave equation has an infinite number of solutions (see Problem 8.1). The solution of Eq. (8.8) for a monochromatic plane wave can be written as

$$\mathbf{E} = \mathbf{E}_0 e^{-i(\omega t - \mathbf{k} \cdot \mathbf{r})} \tag{8.9}$$

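where $\mathbf{E}_0$ is the amplitude vector, $\omega$ is the angular frequency, $\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$ is the position vector, and $\mathbf{k} = k_x\hat{\mathbf{x}} + k_y\hat{\mathbf{y}} + k_z\hat{\mathbf{z}}$ is the wavevector, which points toward the direction of propagation. In order for Eq. (8.9) to be a solution of Eq. (8.8), the magnitude of $\mathbf{k}$ must be $k = \omega\sqrt{\mu_m \varepsilon_m}$. The complex form of the electric field is used in Eq. (8.9) to facilitate mathematical manipulation. The actual electric field may be expressed as the real part of Eq. (8.9), viz.,

$$\mathrm{Re}(\mathbf{E}) = \mathrm{Re}(\mathbf{E}_0)\cos\phi + \mathrm{Im}(\mathbf{E}_0)\sin\phi \tag{8.10}$$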

where Re or Im stands for taking the real part or the imaginary part, and $\phi = \omega t - \mathbf{k} \cdot \mathbf{r}$ is the phase. Equation (8.9) is a time-harmonic solution at a fixed frequency. Because any time-space-dependent function can be expressed as a Fourier series of many frequency components, we can integrate Eq. (8.9) over all frequencies to obtain the total electric field at any time and position. Therefore, understanding the nature of Eq. (8.9) is very important to the study of electromagnetic wave phenomena. When Eq. (8.9) is substituted into Maxwell’s equations, a time derivative $\partial/\partial t$ can be replaced by a multiplication of $-i\omega$ and the operator $\nabla$ can be replaced by $i\mathbf{k}$. Hence, the first two Maxwell equations can be written as

$$\mathbf{k} \times \mathbf{E} = \omega \mu_m \mathbf{H} \tag{8.11a}$$

and

$$\mathbf{k} \times \mathbf{H} = -\omega \varepsilon_m \mathbf{E} \tag{8.11b}$$

The two equations suggest that $\mathbf{E}$, $\mathbf{H}$, and $\mathbf{k}$ are orthogonal and form a right-handed triplet, when both $\varepsilon_m$ and $\mu_m$ are positive. On the surface normal to the wavevector $\mathbf{k}$, the electric or magnetic field is a function of time only, because $\mathbf{k} \cdot \mathbf{r} = \text{const}$. This surface is called a wavefront. In the $\mathbf{k}$ direction, the wavefront travels at the speed given by

$$c = \frac{\omega}{k} = \frac{1}{\sqrt{\mu_m \varepsilon_m}} \tag{8.12}$$

which is called phase speed, and it is the smallest speed at which the phase of the wave propagates.8 The phase velocity is the phase speed times the unit wavevector. Figure 8.1 illustrates a plane wave, propagating in the positive x direction, whose electric field is parallel to the y direction and magnetic field parallel to the z direction. In such cases, $\mathbf{k} = k\hat{\mathbf{x}}$ and $\mathbf{k} \cdot \mathbf{r} = kx$. The wavefront is perpendicular to the x direction. It can be seen clearly that the wavevector is related to the wavelength $\lambda_m$ in the medium by $k = 2\pi/\lambda_m$. In free space, the speed of electromagnetic wave is given by $c_0 = 1/\sqrt{\mu_0 \varepsilon_0}$. The speed of light in vacuum was established as an exact number, $c_0 = 299{,}792{,}458$ m/s, by the General Conference on Weights and Measures (abbreviated as CGPM for Conférence Générale des Poids et Mesures) in 1983. The SI base unit meter has since been defined as the distance that light travels in vacuum during a time interval of 1/299,792,458 s. The NIST reference on constants, units, and uncertainty can be found on the web page: http://physics.nist.gov/cuu/index.html, which contains detailed discussions about the fundamental physical constants and the base SI units. For most calculations, it suffices to use $c_0 = 2.998 \times 10^8$ m/s.

FIGURE 8.1 Illustration of a linearly polarized electromagnetic wave.

The refractive index of the medium is given as $n = \sqrt{\mu_m \varepsilon_m/(\mu_0 \varepsilon_0)} = c_0/c$. Therefore, $c = c_0/n$ and $\lambda_m = \lambda/n$, where $\lambda$ is the wavelength in vacuum. For nonmagnetic materials $\mu_m/\mu_0 = 1$; thus, $n = \sqrt{\varepsilon_m/\varepsilon_0}$. Notice that $n$ of a medium is a function of frequency (or wavelength) and is in general temperature dependent. For polychromatic light, the phase speed usually depends on wavelength because $n = n(\lambda)$ in a dispersive medium. In vacuum, the energy propagation velocity is the same as the phase velocity. For polychromatic waves in a dispersive medium, the group velocity $\mathbf{v}_g$ determines the direction and speed of energy flow and is defined as

$$\mathbf{v}_g = \nabla_{\mathbf{k}}\,\omega = \frac{d\omega}{d\mathbf{k}} = \frac{\partial \omega}{\partial k_x}\hat{\mathbf{x}} + \frac{\partial \omega}{\partial k_y}\hat{\mathbf{y}} + \frac{\partial \omega}{\partial k_z}\hat{\mathbf{z}} \tag{8.13}$$

which is the gradient of $\omega$ in the k-space. In a homogeneous and isotropic medium, $v_g = c_0/(n + \omega\, dn/d\omega)$ and the direction of the group velocity will be the same as that of the wavevector $\mathbf{k}$. In a nondispersive medium, where $n$ is not a function of frequency, it is clear that $v_g = c = c_0/n$. When light is refracted from a nondispersive medium to a dispersive medium, the group velocity can have a component parallel to the group fronts, and hence, the energy flow is not necessarily perpendicular to the group fronts.8 Notice that the wave equation is also applicable to other types of waves such as acoustic waves, which are matter waves with a longitudinal and two transverse modes, as mentioned in Chap. 5.

8.1.3 Polarization

A simple transverse wave will oscillate perpendicular to the wavevector. Because electromagnetic waves have two field vectors that can change their directions during propagation, the polarization behavior may be complicated. It is important to understand the nature of polarization in order to fully characterize an electromagnetic wave. There are two equivalent ways to interpret a complex vector $\mathbf{A}$. The first method considers it as a vector whose components are complex, i.e.,

$$\mathbf{A} = A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}} + A_z\hat{\mathbf{z}} \tag{8.14}$$


where $A_x$, $A_y$, and $A_z$ are complex numbers:

$$A_x = A'_x + iA''_x, \quad A_y = A'_y + iA''_y, \quad \text{and} \quad A_z = A'_z + iA''_z \tag{8.14a}$$

The second method decomposes it into two real vectors such that

$$\mathbf{A} = \mathbf{A}' + i\mathbf{A}'' \tag{8.15}$$

where $\mathbf{A}'$ and $\mathbf{A}''$ are the real and imaginary parts of the complex vector, given by

$$\mathbf{A}' = A'_x\hat{\mathbf{x}} + A'_y\hat{\mathbf{y}} + A'_z\hat{\mathbf{z}} \quad \text{and} \quad \mathbf{A}'' = A''_x\hat{\mathbf{x}} + A''_y\hat{\mathbf{y}} + A''_z\hat{\mathbf{z}} \tag{8.15a}$$

In either case, a complex vector has six real scalar terms. For the time being, let us assume all the material properties to have real values and $\mathbf{k}$ to be a real vector. Both $\mathbf{E}$ and $\mathbf{H}$ are complex, according to Eq. (8.9). To ensure that $\mathbf{k} \cdot \mathbf{E} = 0$ at any time and location, both $\mathrm{Re}(\mathbf{E}_0)$ and $\mathrm{Im}(\mathbf{E}_0)$ must be perpendicular to $\mathbf{k}$. The same is true for the magnetic vector. Because $\mathbf{H}$ can be obtained from Eq. (8.11a), the state of polarization can be based on how the electric field varies in time and along the $\mathbf{k}$ direction in space. In order to study the time dependence of the electric field, rewrite Eq. (8.10) as

$$\mathrm{Re}(\mathbf{E}) = \mathbf{a}\cos(\omega t) + \mathbf{b}\sin(\omega t) \tag{8.16}$$

where $\mathbf{a} = \mathrm{Re}(\mathbf{E}_0 e^{i\mathbf{k} \cdot \mathbf{r}})$ and $\mathbf{b} = \mathrm{Im}(\mathbf{E}_0 e^{i\mathbf{k} \cdot \mathbf{r}})$ are both real vectors and perpendicular to $\mathbf{k}$. In general, the electric field will vary with time in an ellipse, called the vibration ellipse, as shown in Fig. 8.2.

FIGURE 8.2 Illustration of polarization by the vibration ellipse, for a plane wave propagating in the positive z direction (out of the paper). The electric field vector is plotted at increments of $\omega t = \pi/12$.

If $\mathbf{a}$ and $\mathbf{b}$ are parallel or, equivalently, $\mathrm{Re}(\mathbf{E}_0)$ and $\mathrm{Im}(\mathbf{E}_0)$ are parallel to each other, then the electric field will not change its direction. The wave is said to be linearly polarized, and either $\mathbf{a}$ or $\mathbf{b}$ specifies the direction of polarization. An example of a linearly polarized wave is the wave shown in Fig. 8.1. When $\mathbf{a} \perp \mathbf{b}$ and $|\mathbf{a}| = |\mathbf{b}|$, the vibration ellipse is a circle and the wave is said to be circularly polarized. In general, a monochromatic wave


described by Eq. (8.10) is elliptically polarized. For circularly or elliptically polarized light, if $\mathbf{a} \times \mathbf{b}$ is in the same direction as $\mathbf{k}$, the vibration ellipse will rotate counterclockwise (left-handed), as viewed toward the light source; and if $\mathbf{a} \times \mathbf{b}$ is opposite to the direction of propagation, the vibration ellipse will rotate clockwise (right-handed).6,7 Similarly, one can consider the polarization of the electric field at a fixed time, and observe the vibration ellipse along the direction of propagation as an exercise (see Problem 8.2). Because of the random nature of thermal radiation, the Fourier component does not vary with time exactly following $e^{-i\omega t}$ but with some fluctuations in the amplitude. The polarization may become completely random, which is said to be unpolarized, randomly polarized, or completely uncorrelated. In any case, the electric field can be decomposed into the two orthogonal directions on the vibration ellipse. This is particularly useful for calculating energy transfer. A complete description of polarization is based on Stokes parameters, which are important in the study of light scattering and will be discussed in Chap. 9.
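As a rough illustration of the classification described in this section, the following Python sketch inspects the real and imaginary parts of the amplitude for a wave travelling along +z, evaluated at r = 0 so that a = Re(E0) and b = Im(E0). The function name, the tolerance, and the restriction to propagation along z are assumptions made for this example only.

```python
import math


def classify_polarization(e0x, e0y, tol=1e-9):
    """Classify the polarization of a plane wave travelling along +z.

    e0x, e0y are the complex x and y components of the amplitude E0,
    evaluated at r = 0 so that a = Re(E0) and b = Im(E0).
    """
    e0x, e0y = complex(e0x), complex(e0y)
    a = (e0x.real, e0y.real)
    b = (e0x.imag, e0y.imag)
    norm_a = math.hypot(a[0], a[1])
    norm_b = math.hypot(b[0], b[1])
    scale = max(norm_a, norm_b) ** 2
    if scale == 0.0:
        raise ValueError("E0 must be nonzero")

    cross_z = a[0] * b[1] - a[1] * b[0]  # (a x b) . z, with k along +z
    dot = a[0] * b[0] + a[1] * b[1]      # a . b

    if abs(cross_z) <= tol * scale:
        return "linear"  # a and b are parallel, or one of them vanishes

    is_circle = (abs(dot) <= tol * scale
                 and abs(norm_a - norm_b) <= tol * max(norm_a, norm_b))
    shape = "circular" if is_circle else "elliptical"
    # a x b along k: counterclockwise (left-handed) when viewed toward the source.
    handedness = "left-handed" if cross_z > 0 else "right-handed"
    return f"{handedness} {shape}"
```

For example, `classify_polarization(1, 1j)` gives a = x̂ and b = ŷ, so a × b points along +z and the wave is reported as left-handed circular under the convention of Eq. (8.16).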

8.1.4 Energy Flux and Density

The energy conservation for electromagnetic field can be obtained from Maxwell’s equations, according to English physicist John Poynting (1852–1914). To derive Poynting’s theorem, one can dot multiply Eq. (8.1) and Eq. (8.2) by $\mathbf{H}$ and $-\mathbf{E}$, respectively, and then add up each side. Using the vector identity in Eq. (B.63), we have $\nabla \cdot (\mathbf{E} \times \mathbf{H}) = (\nabla \times \mathbf{E}) \cdot \mathbf{H} - (\nabla \times \mathbf{H}) \cdot \mathbf{E}$. After simplifications, we obtain

$$-\nabla \cdot (\mathbf{E} \times \mathbf{H}) = \frac{\partial}{\partial t}\left(\frac{1}{2}\varepsilon_m \mathbf{E} \cdot \mathbf{E} + \frac{1}{2}\mu_m \mathbf{H} \cdot \mathbf{H}\right) + \mathbf{E} \cdot \mathbf{J} \tag{8.17}$$

The left-hand term represents the energy flow into a differential control volume, the first term on the right is the rate of change of the stored energy (associated with the electric and magnetic fields), and the last term is the dissipated electromagnetic work or Joule heating. The Poynting vector is defined as

$$\mathbf{S} = \mathbf{E} \times \mathbf{H} \tag{8.18a}$$

The Poynting vector is essentially the energy flux, which gives both the direction and the rate of energy flow per unit projected surface area. Equation (8.17) and Eq. (8.18a) can be easily extended to the complex field notation. Although it is easy to write the Poynting vector (which is always real) as $\mathbf{S} = \mathrm{Re}(\mathbf{E}) \times \mathrm{Re}(\mathbf{H})$, it is not very helpful because one would have to evaluate the real parts of $\mathbf{E}$ and $\mathbf{H}$ individually. Besides, the frequency of oscillation is usually too high to be measured. For harmonic fields, the time-averaged Poynting vector can be expressed as

$$\langle \mathbf{S} \rangle = \frac{1}{2}\mathrm{Re}(\mathbf{E} \times \mathbf{H}^*) \tag{8.18b}$$

where * signifies the complex conjugate. Similarly, the time-averaged energy density for time-harmonic fields can be expressed as5

$$\langle u \rangle = \frac{1}{4}\varepsilon_m \mathbf{E} \cdot \mathbf{E}^* + \frac{1}{4}\mu_m \mathbf{H} \cdot \mathbf{H}^* \tag{8.19}$$

For an absorbing or dissipative medium, a more complete description of the energy density can be found in Cui and Kong (Phys. Rev. B, 70, 205106, 2004).


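Example 8-1. Prove that Eq. (8.18b) is the time-averaged Poynting vector for time-harmonic fields.

Solution. Let $\mathbf{E} = \mathbf{E}(\mathbf{r})e^{-i\omega t}$ and $\mathbf{H} = \mathbf{H}(\mathbf{r})e^{-i\omega t}$, where $\mathbf{E}(\mathbf{r})$ and $\mathbf{H}(\mathbf{r})$ are complex vectors. Integrating the Poynting vector over a period $T$, we have

$$\langle \mathbf{S} \rangle = \frac{1}{T}\int_T \mathrm{Re}(\mathbf{E}) \times \mathrm{Re}(\mathbf{H})\,dt = \frac{1}{4T}\int_T \left[\mathbf{E}(\mathbf{r})e^{-i\omega t} + \mathbf{E}^*(\mathbf{r})e^{i\omega t}\right] \times \left[\mathbf{H}(\mathbf{r})e^{-i\omega t} + \mathbf{H}^*(\mathbf{r})e^{i\omega t}\right]dt$$

$$= \frac{1}{4}\left(\mathbf{E} \times \mathbf{H}^* + \mathbf{E}^* \times \mathbf{H}\right) = \frac{1}{2}\mathrm{Re}(\mathbf{E} \times \mathbf{H}^*)$$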
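The result of Example 8-1 can also be checked numerically. The sketch below, in Python, treats the simplest geometry (a wave travelling along x with field components E_y and H_z, so that only S_x = E_y H_z is nonzero) and compares a brute-force average of Re(E)Re(H) over one period with the closed form ½Re(E H*). The function names, the sample count, and the scalar simplification are assumptions of this example.

```python
import cmath
import math


def time_averaged_sx_numeric(ey, hz, samples=2000):
    """Average Re(E_y e^{-i wt}) * Re(H_z e^{-i wt}) over one full period."""
    ey, hz = complex(ey), complex(hz)
    total = 0.0
    for j in range(samples):
        phase = 2.0 * math.pi * j / samples  # value of omega*t within one period
        rot = cmath.exp(-1j * phase)         # time factor e^{-i omega t}
        total += (ey * rot).real * (hz * rot).real
    return total / samples


def time_averaged_sx_formula(ey, hz):
    """Closed form (1/2) Re(E_y H_z^*), the scalar version of Eq. (8.18b)."""
    ey, hz = complex(ey), complex(hz)
    return 0.5 * (ey * hz.conjugate()).real
```

With the arbitrary amplitudes `ey = 3 + 4j` and `hz = 1 - 2j`, both functions return -2.5 to within rounding, since the terms oscillating at 2ω average to zero over a full period.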

8.1.5 Dielectric Function

The conductivity is large at low frequencies for metals, due to free electrons. Even for good conductors, however, the electrons are not completely free but will be scattered by defects and phonons. At high frequencies, the current density $\mathbf{J}$ and the electric field $\mathbf{E}$ are not in phase anymore, suggesting that the conductivity should be a complex number. For insulators such as crystalline or amorphous dielectrics, electromagnetic waves can interact with bound electrons or lattice vibrations to transfer energy to the medium. At optical frequencies, the distinction between a conductor and an insulator becomes ambiguous unless the optical response over a large frequency region is considered. For example, a dielectric material can be highly reflective at a certain frequency region in the mid-infrared. On the other hand, a good conductor will be highly reflective in a much broader wavelength region from the near-infrared to the microwave. Let us first take the conductivity and the permittivity to be real, for a nonmagnetic material. The wave equation for $\sigma \neq 0$ and $\mu_m = \mu_0$ has the following form:

$$\nabla^2 \mathbf{E} = \mu_0 \sigma \frac{\partial \mathbf{E}}{\partial t} + \mu_0 \varepsilon_m \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{8.20}$$

Suppose Eq. (8.9) is a solution of this equation. We can substitute $\partial \mathbf{E}/\partial t = -i\omega \mathbf{E}$, $\partial^2 \mathbf{E}/\partial t^2 = -\omega^2 \mathbf{E}$, and $\nabla^2 \mathbf{E} = -k^2 \mathbf{E}$ into Eq. (8.20) to obtain

$$k^2 = i\omega \mu_0 \sigma + \omega^2 \mu_0 \varepsilon_m \tag{8.21}$$

Therefore, the wavevector becomes complex: $\mathbf{k} = \mathbf{k}' + i\mathbf{k}''$, where $\mathbf{k}' = k'_x\hat{\mathbf{x}} + k'_y\hat{\mathbf{y}} + k'_z\hat{\mathbf{z}}$ and $\mathbf{k}'' = k''_x\hat{\mathbf{x}} + k''_y\hat{\mathbf{y}} + k''_z\hat{\mathbf{z}}$ are real vectors. Note that Eq. (8.21) tells us the value of $k^2 = \mathbf{k} \cdot \mathbf{k} = k_x^2 + k_y^2 + k_z^2$, where each wavevector component may be complex, but does not specify the individual components. The complex dielectric function is defined as

$$\varepsilon = \varepsilon' + i\varepsilon'' = \frac{\varepsilon_m}{\varepsilon_0} + i\frac{\sigma}{\omega \varepsilon_0} \tag{8.22}$$

For a nonmagnetic material, the complex refractive index $\tilde{n} = n + i\kappa$ is related to the complex dielectric function by $\varepsilon = (n + i\kappa)^2$. The imaginary part $\kappa$ of the complex refractive index is called the extinction coefficient. By definition, we have

$$\varepsilon' = n^2 - \kappa^2 \quad \text{and} \quad \varepsilon'' = 2n\kappa \tag{8.23}$$

The refractive index $n$ and the extinction coefficient $\kappa$ are also called optical constants,9 although neither of them is constant over a large wavelength region for real materials. The


dielectric function is also called relative permittivity, with respect to the permittivity of vacuum $\varepsilon_0$. One can consider the $\sigma/\omega$ term in Eq. (8.22) as the imaginary part of the permittivity. Some texts used $\varepsilon = \varepsilon' - i\varepsilon''$ for the dielectric function and $\tilde{n} = n - i\kappa$ for the complex refractive index. In doing so, Eq. (8.9) must be revised to $\mathbf{E} = \mathbf{E}_0 e^{i(\omega t - \mathbf{k} \cdot \mathbf{r})}$. In either convention, $\varepsilon''$ and $\sigma$ must be nonnegative for a passive medium. Equation (8.21) can be rewritten as

$$k = \tilde{n}\omega/c_0 \tag{8.24}$$

For simplicity, we will remove the tilde and simply use $n$ for the complex refractive index, where it can be clearly understood from the context. By substituting $i\mathbf{k}$ for $\nabla$ and $-i\omega$ for $\partial/\partial t$, we can rewrite Maxwell’s curl equations as

$$\mathbf{k} \times \mathbf{E} = \omega \mu_0 \mathbf{H} \tag{8.25}$$

and

$$\mathbf{k} \times \mathbf{H} = -\omega \varepsilon_0 \varepsilon \mathbf{E} \tag{8.26}$$

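Similar to the definition of the complex dielectric function, one may choose to define a complex conductivity that satisfies Ohm’s law at high frequencies, $\mathbf{J} = \tilde{\sigma}\mathbf{E}$, where

$$\tilde{\sigma} = \sigma' + i\sigma'' = \sigma - i\omega \varepsilon_m \tag{8.27}$$

Therefore, because we have assumed that $\sigma$ is the real part of $\tilde{\sigma}$,

$$\sigma'' = -\omega \varepsilon_0 \varepsilon' \quad \text{and} \quad \varepsilon'' = \sigma'/\omega \varepsilon_0 \tag{8.28}$$

Equation (8.26) can be recast in terms of the complex conductivity as

$$\mathbf{k} \times \mathbf{H} = -i\tilde{\sigma}\mathbf{E} \tag{8.29}$$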

In the subsequent discussion, we will omit the tilde above $\sigma$, when the context is sufficiently clear. The complex conductivity and the complex dielectric function are related to each other. For a linear, isotropic, and homogeneous nonmagnetic material, only two frequency-dependent functions are needed to fully characterize the electromagnetic response. The function pairs often found in the literature are $(n, \kappa)$, $(\sigma', \varepsilon')$, $(\varepsilon', \varepsilon'')$, and $(\sigma', \sigma'')$. The principle of causality, which states that the effect cannot precede the cause, or no output before an input, imposes additional restrictions on the frequency dependence of the optical properties so that the real and imaginary parts are not completely independent, but related to each other. In general, the relative permeability, which is complex and frequency dependent, can be expressed as

$$\mu = \mu' + i\mu'' = \mu_m/\mu_0 \tag{8.30}$$

The complex refractive index for magnetic materials should be defined as follows:

$$n = \sqrt{\varepsilon \mu} \tag{8.31}$$

The amplitude of the complex wavevector is $k = n\omega/c_0$, the same as Eq. (8.24). One can verify that Eq. (8.9) is a solution of the wave equation. The relative permittivity $\varepsilon$ and permeability $\mu$ will be used to formulate the general equations later in this chapter. In most sections of this chapter, we deal with nonmagnetic materials, such as metals, dielectrics, and semiconductors. However, we will devote Sec. 8.4.6 to the optical properties of magnetic materials, because of the emerging interest in metamaterials, which are synthesized materials with magnetic response at the microwave and higher frequencies (see Problem 8.6, for example).
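To make the relation between the optical constants and the dielectric function concrete, here is a minimal Python sketch of Eq. (8.23) and its inverse for a nonmagnetic material. The function names are illustrative, and the inverse assumes the sign convention of this chapter (ε″ ≥ 0 for a passive medium), so that the principal complex square root returns n ≥ 0 and κ ≥ 0.

```python
import cmath


def dielectric_from_nk(n, kappa):
    """Return (eps', eps'') from eps = (n + i*kappa)**2, i.e., Eq. (8.23)."""
    eps = complex(n, kappa) ** 2
    return eps.real, eps.imag


def nk_from_dielectric(eps_real, eps_imag):
    """Return (n, kappa) for a nonmagnetic medium with eps'' >= 0."""
    if eps_imag < 0:
        raise ValueError("eps'' must be nonnegative in this sign convention")
    root = cmath.sqrt(complex(eps_real, eps_imag))  # principal branch
    return root.real, root.imag
```

For instance, `dielectric_from_nk(1.5, 0.5)` gives ε′ = 1.5² − 0.5² = 2.0 and ε″ = 2(1.5)(0.5) = 1.5, and `nk_from_dielectric(2.0, 1.5)` recovers the original pair. A negative real ε′ with ε″ = 0, as in an idealized lossless metal, yields n = 0 and a purely imaginary refractive index.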


8.1.6 Propagating and Evanescent Waves

In an absorbing nonmagnetic medium, the electric and magnetic fields will attenuate exponentially. As an example, consider a wave that propagates in the positive x direction, with its electric field polarized in the y direction. Then,

$$\mathbf{E} = \hat{\mathbf{y}}E_0 e^{-i(\omega t - k'x)}e^{-k''x} \tag{8.32}$$

where $k' = \omega n/c_0$ and $k'' = \omega \kappa/c_0$ are the real and imaginary parts of the wavevector, respectively; that is, $\mathbf{k} = (k' + ik'')\hat{\mathbf{x}}$. Equation (8.32) suggests that the amplitude of the electric field will decay exponentially according to $e^{-(2\pi\kappa/\lambda)x}$. The magnetic field can be obtained from Eq. (8.25) as

$$\mathbf{H} = \hat{\mathbf{z}}\frac{n + i\kappa}{\mu_0 c_0}E_0 e^{-i(\omega t - k'x)}e^{-k''x} \tag{8.33}$$

By substituting Eq. (8.32) and Eq. (8.33) into Eq. (8.18b), we obtain the time-averaged energy flux in the x direction as

$$\langle S \rangle = \frac{n}{2\mu_0 c_0}E_0^2 e^{-2k''x} = \frac{n}{2\mu_0 c_0}E_0^2 e^{-a_\lambda x} \tag{8.34}$$

where $a_\lambda = 4\pi\kappa/\lambda$ is called the absorption coefficient. The inverse of $a_\lambda$ is called the radiation penetration depth (or photon mean free path) given by

$$\delta_\lambda = \frac{1}{a_\lambda} = \frac{\lambda}{4\pi\kappa} \tag{8.35}$$
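A short Python sketch of Eq. (8.34) and Eq. (8.35) follows; it evaluates the penetration depth and the fraction of the energy flux remaining after a distance x. The function names are hypothetical, and the numerical inputs in the usage note are arbitrary values chosen for illustration rather than data for any particular material.

```python
import math


def penetration_depth(wavelength, kappa):
    """Radiation penetration depth, delta = lambda / (4*pi*kappa), Eq. (8.35).

    The result carries the same length unit as the vacuum wavelength.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive for an absorbing medium")
    return wavelength / (4.0 * math.pi * kappa)


def flux_fraction_remaining(x, wavelength, kappa):
    """Fraction exp(-a*x) of the energy flux left after a distance x, Eq. (8.34)."""
    return math.exp(-x / penetration_depth(wavelength, kappa))
```

For a vacuum wavelength of 0.5 µm and an assumed κ = 0.01, `penetration_depth(0.5, 0.01)` is about 3.98 µm, and the flux drops to 1/e of its initial value over that distance.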

The radiation penetration depth is the distance through which the radiation power is attenuated by a factor of 1/e [...]

[...] $\int_{\lambda_{mp}}^{\infty} e_{b,\lambda}(\lambda,T)\,d\lambda$. The numerical result is approximately 1:3 and independent of temperature.

For a medium of refractive index $n$, the speed of light $c$ should be replaced by $c_0/n$ in Eq. (8.43). In the previous discussion, we have assumed a nondispersive medium with $n \equiv 1$, which is true for vacuum only. Corrections are rarely needed if the medium is a gas, but would be necessary for radiation inside solids or liquids. Furthermore, in a dispersive medium, the group velocity needs to be considered in deriving the density of states $D(\nu)$ in Eq. (8.40) and the energy flux in Eq. (8.42); see Prasher (Appl. Phys. Lett., 86, 071914, 2005).

Example 8-3. Assuming the sun to be a blackbody at 5800 K, calculate the emitted power at the following wavelength intervals: λ < 0.3 µm, 0.3 µm < λ < 0.4 µm, 0.4 µm < λ < 0.7 µm, 0.7 µm < λ < 3 µm, and λ > 3 µm. Neglect the absorption by the atmosphere. What is the radiant power arriving at the earth’s surface from the sun?

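Solution. The total emissive power is $\sigma_{SB}T_{sun}^4 = 5.67 \times 10^{-8} \times 5800^4 \approx 64$ MW/m². We can obtain the emitted power in each spectral region by integrating Eq. (8.44), as listed in the following table. Note that $F_{\lambda_1 \to \lambda_2}$ represents the fraction of radiation falling between $\lambda_1$ and $\lambda_2$.

| λ (µm) | < 0.3 | 0.3–0.4 | 0.4–0.7 | 0.7–3 | > 3 | Total |
|---|---|---|---|---|---|---|
| $\lambda_2 T$ (µm·K) | 1740 | 2320 | 4060 | 17400 | ∞ | _ |
| $F_{0 \to \lambda_2}$ | 0.03 | 0.12 | 0.49 | 0.98 | 1 | _ |
| $F_{\lambda_1 \to \lambda_2}$ | 0.03 | 0.09 | 0.37 | 0.49 | 0.02 | 1 |
| $E_b$ (MW/m²) | 1.9 | 5.8 | 23.7 | 31.4 | 1.3 | 64.1 |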

The total power emitted by the sun equals the emissive power multiplied by the surface area of the sun. The fraction of the power that reaches the earth equals the solid angle of the earth divided by $4\pi$. Note that the radius of the sun $r_{sun} = 6.955 \times 10^8$ m, the radius of the earth $r_{earth} = 6.378 \times 10^6$ m, and the earth-sun distance $R_{earth\text{-}sun} = 1.496 \times 10^{11}$ m. Therefore, the total power that will reach the earth’s surface, if the absorption by the atmosphere is neglected, is

$$\dot{Q} = 4\pi r_{sun}^2 \cdot \sigma_{SB}T_{sun}^4 \cdot \frac{\pi r_{earth}^2}{4\pi R_{earth\text{-}sun}^2} \approx 1.8 \times 10^{17}\ \text{W}$$

The average irradiation on the earth is $G = \dot{Q}/\pi r_{earth}^2 \approx 1377$ W/m². This value is very close to the total solar irradiance (TSI), measured outside the earth’s atmosphere.
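The arithmetic of Example 8-3 can be retraced with the short Python sketch below. It is not the method used in the text: the band fractions are evaluated with a standard series expansion of the blackbody fraction in terms of x = C2/(λT), and the second radiation constant C2 ≈ 14388 µm·K as well as all function and variable names are assumptions of this example. With these rounded inputs the results fall within about 1% of the values quoted above.

```python
import math

SIGMA_SB = 5.67e-8      # Stefan-Boltzmann constant used in the example, W/(m^2 K^4)
C2 = 14388.0            # second radiation constant h*c0/kB, um K (rounded, assumed)
T_SUN = 5800.0          # K
R_SUN = 6.955e8         # m
R_EARTH = 6.378e6       # m
D_EARTH_SUN = 1.496e11  # m


def blackbody_fraction(lambda_t, terms=200):
    """Fraction of blackbody emission at wavelengths below lambda.

    lambda_t is the product lambda*T in um K; the sum is a series form of
    the integrated Planck distribution.
    """
    if lambda_t <= 0:
        return 0.0
    x = C2 / lambda_t
    total = 0.0
    for m in range(1, terms + 1):
        total += math.exp(-m * x) / m * (
            x ** 3 + 3 * x ** 2 / m + 6 * x / m ** 2 + 6 / m ** 3)
    return 15.0 / math.pi ** 4 * total


def solar_power_at_earth():
    """Return (emissive power, power intercepted by the earth, irradiation)."""
    e_b = SIGMA_SB * T_SUN ** 4                       # W/m^2 at the sun's surface
    q_sun = 4.0 * math.pi * R_SUN ** 2 * e_b          # total power emitted, W
    fraction = math.pi * R_EARTH ** 2 / (4.0 * math.pi * D_EARTH_SUN ** 2)
    q_earth = q_sun * fraction                        # W
    irradiation = q_earth / (math.pi * R_EARTH ** 2)  # W/m^2
    return e_b, q_earth, irradiation


if __name__ == "__main__":
    for upper_um in (0.3, 0.4, 0.7, 3.0):
        print(upper_um, round(blackbody_fraction(upper_um * T_SUN), 3))
    print(solar_power_at_earth())
```

The band fraction between two wavelengths is the difference of two calls, for example `blackbody_fraction(0.7 * T_SUN) - blackbody_fraction(0.4 * T_SUN)` for the visible band in the table.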

Because of the broad spectral region of electromagnetic waves, alternative units are often used, such as wavelength $\lambda$ (in vacuum), wavenumber $\bar{\nu} = 1/\lambda$, frequency $\nu = c_0/\lambda$, angular frequency $\omega = 2\pi\nu$, and photon energy $E = h\nu$. Generally speaking, optical radiation covers the spectral region of ultraviolet (UV), visible (VIS), near-infrared (NIR), mid-infrared (MIR), and far-infrared (FIR). Table 8.1 outlines the subdivisions of the spectral region in different units from ultraviolet (UV) to microwave (MW).

TABLE 8.1 Spectral Regions Expressed in Different Units

| Quantity | UV (from–to) | VIS (up to) | NIR (up to) | MIR (up to) | FIR (up to) | MW (up to) |
|---|---|---|---|---|---|---|
| Wavelength, $\lambda$ (µm)* | 0.01–0.38 | 0.76 | 2.5 | 25 | 10³ | 10⁵ |
| Wavenumber, $\bar{\nu}$ (cm⁻¹) | 10⁶–(2.6 × 10⁴) | 1.3 × 10⁴ | 4 × 10³ | 400 | 10 | 0.1 |
| Frequency, $\nu$ (THz) | (3 × 10⁴)–790 | 395 | 120 | 12 | 0.3 | 3 × 10⁻³ |
| Angular frequency, $\omega$ (10¹² rad/s) | (2 × 10⁵)–(5 × 10³) | 2.5 × 10³ | 750 | 75 | 1.9 | 0.02 |
| Photon energy, $E$ (eV)† | 124–3.3 | 1.63 | 0.5 | 0.05 | 1.2 × 10⁻³ | 1.2 × 10⁻⁵ |

* The wavelength will be reduced in a medium whose refractive index $n$ is not unity.
† The conversion from the vacuum wavelength $\lambda$ in µm to the photon energy $E$ in eV is $E = 1.240/\lambda$.
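The unit conversions behind Table 8.1 can be sketched in a few lines of Python. The function name and the dictionary keys are illustrative choices; the photon energy uses the rounded conversion E = 1.240/λ quoted in the table footnote, and the angular frequency is reported in units of 10¹² rad/s to match the THz frequency.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s (exact)


def spectral_units(wavelength_um):
    """Convert a vacuum wavelength in um to the other units of Table 8.1."""
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    frequency_thz = C0 / (wavelength_um * 1e-6) / 1e12
    return {
        "wavenumber_per_cm": 1.0e4 / wavelength_um,        # 1/lambda in cm^-1
        "frequency_THz": frequency_thz,                    # c0/lambda
        "angular_frequency_Trad_per_s": 2.0 * math.pi * frequency_thz,
        "photon_energy_eV": 1.240 / wavelength_um,         # rounded h*c0
    }
```

For example, `spectral_units(0.76)` returns roughly 1.3 × 10⁴ cm⁻¹, 395 THz, 2.5 × 10³ × 10¹² rad/s, and 1.63 eV, in line with the VIS column of the table.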

8.2.2 Radiation Thermometry

The developments of the absolute temperature scale and radiation thermometry are among the most important applications of blackbody radiation. The Stefan-Boltzmann law $e_b = \sigma_{SB}T^4$ defines an absolute thermodynamic temperature, which is consistent with the one defined by the ideal-gas law and the Carnot cycle. While radiation thermometry can serve as a primary standard, most practical radiation thermometers are not absolute instruments because of other considerations such as fast response, easy operation, and low cost. High-temperature furnaces are commonly used as calibration standards. The cavity is a hollow cylinder, made of graphite for example, with a conical ending and a small aperture. The most accurate calibration source is the fixed-point heat pipe blackbody, for which a pure metal is melted outside the graphite cylinder to maintain a constant temperature in a two-phase state. The freezing temperatures are then used to define the temperature scales (1234.93 K for Ag, 1337.33 K for Au, and 1357.77 K for Cu). To measure the absolute temperature of a thermally radiative body, two blackbody cavities at different temperatures would be needed: one serves as the emitter (blackbody source) and the other as the receiver (radiometer). Quinn and Martin used a blackbody source and a cryogenic radiometer to directly determine the thermodynamic temperatures and measure the Stefan-Boltzmann constant.10 The experimentally obtained Stefan-Boltzmann constant was $(5.66967 \pm 0.00076) \times 10^{-8}$ W/(m²·K⁴). The difference is 0.013% of the theoretical value $(5.67040 \pm 0.00004) \times 10^{-8}$ W/(m²·K⁴), based on Planck’s constant, Boltzmann’s constant, and the speed of light. Since the early 1990s, the National Institute of Standards and Technology (NIST) has developed a high-accuracy cryogenic radiometer (HACR) facility to serve as the primary standard for optical radiation measurements. A schematic of the original HACR receiver is shown in Fig. 8.6.
The receiver is mounted at the bottom of a liquid-helium cryostat in an evacuated chamber, and the optical access is through a Brewster window below the cavity. The HACR facility has gone through some major upgrades in recent years. The receiver cavity is made of copper, which has a high thermal conductivity and a low specific heat at cryogenic temperatures. The inner wall of the cavity is coated with a specular black paint to absorb the incident radiation with an effective absorptance greater than 99.998%. The electrical-substitution technique links the radiant power to the electric power to achieve an overall uncertainty within 0.02% for optical power measurements. Detailed descriptions can be found in Pearson and Zhang and references therein.11

FIGURE 8.6 Schematic of the receiver cavity of an absolute cryogenic radiometer, where GRT stands for germanium resistance thermometer, from Pearson and Zhang.11

The cosmic microwave background radiation, measured with cryogenic bolometers, can be fitted to the blackbody distribution at 2.7 K, which is the temperature of the universe at the present time. The discovery of the cosmic background radiation in 1964 and the subsequent measurements and theoretical studies have been recognized by the Nobel Prizes in Physics awarded to Arno Penzias and Robert Wilson in 1978 and to John Mather and George Smoot in 2006.

Most radiation thermometers are based on spectral measurements rather than on the measurement of the total irradiance from the target. When a radiation thermometer is used to measure the temperature of a real surface, the unknown emissivity of the surface and the influence of the surrounding radiation are the major issues that affect the measurement. Various methods have been developed to deal with these problems, including the creation of a blackbody cavity on the surface, the two-color method, and the use of a
controlled reference source.12 The development of optical fibers has allowed radiometric temperature measurements at surface locations that are otherwise inaccessible to imaging radiometers. The measurement equation of a spectral radiation thermometer can be approximated as follows:

$$V_d = C_I\, I_{\mathrm{ex},\lambda}(\lambda) \tag{8.46}$$

where $V_d$ is the detector output signal and $C_I$ is an instrument constant that is independent of the target material and temperature. The term $I_{\mathrm{ex},\lambda}(\lambda)$ is called the exitent spectral radiance, which includes the radiation emitted by the target and the surroundings, as well as that reflected by the target. The radiance temperature $T_\lambda$ (also called the brightness temperature) is defined according to

$$I_{b,\lambda}(\lambda, T_\lambda) = I_{\mathrm{ex},\lambda}(\lambda) \tag{8.47}$$

where $I_{b,\lambda}(\lambda, T_\lambda)$ is the blackbody intensity at the wavelength $\lambda$ and temperature $T_\lambda$. If the surrounding emission and absorption can be neglected, the exitent spectral radiance is due only to the emission; therefore,

$$I_{\mathrm{ex},\lambda}(\lambda) = I_{e,\lambda}(\lambda, T) = \varepsilon'_\lambda\, I_{b,\lambda}(\lambda, T) \tag{8.48}$$

where $\varepsilon'_\lambda$ is the directional-spectral emissivity, and $I_{e,\lambda}(\lambda, T)$ is the intensity emitted by the target. By combining Eq. (8.47) and Eq. (8.48) and applying Wien's formula, the surface temperature is related to the radiance temperature by

$$\frac{1}{T} = \frac{1}{T_\lambda} + \frac{\lambda}{C_2} \ln \varepsilon'_\lambda \tag{8.49}$$

The uncertainty in the measured temperature due to an uncertainty in the emissivity is

$$\frac{\delta T}{T} = -\frac{\lambda T}{C_2}\, \frac{\delta \varepsilon'_\lambda}{\varepsilon'_\lambda} \tag{8.50}$$

The effect of the emissivity uncertainty on the temperature accuracy decreases as $\lambda$ decreases. However, the wavelength at which $I_{b,\lambda}(\lambda, T)$ is a maximum is given by Wien's displacement law. In practice, the choice of the operating wavelength should also be based on the material's properties and the surrounding radiation, and requires a detailed analysis of different effects. If the surrounding radiation is not negligible, $I_{\mathrm{ex},\lambda}(\lambda)$ is the sum of the emitted and reflected spectral radiances, and may be affected by participating medium emission and absorption.

Example 8-4. Rapid thermal processing is a semiconductor single-wafer manufacturing technique. A lightpipe radiation thermometer, operating at $\lambda = 0.95\ \mu$m, is used to measure the wafer temperature. The emissivity of a plain silicon wafer is approximately 0.7 at this wavelength. Neglect the reflected radiation from the wafer. If the wafer is at a temperature of 1200 K, what is the radiance temperature? If the temperature needs to be determined within an uncertainty of 1 K, how much tolerance on the emissivity error is acceptable?

Solution. From Eq. (8.49), $T_\lambda \approx 1167$ K, which differs from the actual temperature by approximately 33 K. One can also solve Eq. (8.47) and Eq. (8.48) using Planck's law, and the result is essentially the same. Based on Eq. (8.50), to obtain a temperature within an uncertainty of 1 K, the emissivity must be determined within an uncertainty of $|\delta \varepsilon'_\lambda| \approx 0.0074$. Zhou et al. (Int. J. Heat Mass Transfer, 45, 1945, 2002) developed a model to predict the effective emissivity of silicon wafers in rapid thermal processing furnaces and showed that the temperature measurement uncertainty can be significantly reduced by using a reflective cavity.
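The numbers in Example 8-4 can be reproduced with a few lines of code. The following is a minimal sketch, assuming the second radiation constant $C_2 = hc/k_B \approx 1.4388 \times 10^{-2}$ m·K; the function names are illustrative and do not come from the text.

```python
import math

C2 = 1.4388e-2  # second radiation constant hc/k_B in m*K (assumed value)

def radiance_temperature(T, wavelength, emissivity):
    """Radiance temperature from Eq. (8.49), i.e., within Wien's approximation."""
    inv_T_lambda = 1.0 / T - (wavelength / C2) * math.log(emissivity)
    return 1.0 / inv_T_lambda

def emissivity_tolerance(T, wavelength, emissivity, dT):
    """Magnitude of the emissivity error that shifts T by dT, from Eq. (8.50)."""
    return emissivity * (dT / T) * C2 / (wavelength * T)

# Values of Example 8-4: silicon wafer at 1200 K viewed at 0.95 micrometers
T_lam = radiance_temperature(1200.0, 0.95e-6, 0.7)
d_eps = emissivity_tolerance(1200.0, 0.95e-6, 0.7, 1.0)
print(f"T_lambda = {T_lam:.0f} K, emissivity tolerance = {d_eps:.4f}")
```

Because the tolerance scales as $1/(\lambda T)$, repeating the calculation at a longer wavelength gives a tighter emissivity requirement for the same 1 K target, which is the trend stated before the example.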

8.2.3 Entropy and Radiation Pressure

Like other particles, a photon gas has entropy, which can be related to other properties in equilibrium states. Express the energy density in an enclosure of volume $V$, at thermodynamic equilibrium with a temperature $T$, as $u = U/V = 4\sigma_{\mathrm{SB}} T^4/c$. It can be seen that the specific heat at constant volume is $c_v = (\partial u/\partial T)_V = 16\sigma_{\mathrm{SB}} T^3/c$. The entropy can therefore be obtained as

$$S = \int_0^T \frac{V c_v}{T}\, dT = \frac{16}{3c}\, V \sigma_{\mathrm{SB}} T^3 \tag{8.51a}$$

or

$$s = \frac{16}{3c}\, \sigma_{\mathrm{SB}} T^3 \tag{8.51b}$$

Note that $T = (\partial U/\partial S)_V$ is satisfied. The Helmholtz free energy is $A = U - TS = -\tfrac{4}{3} V \sigma_{\mathrm{SB}} T^4/c$. Thus, the radiation pressure is

$$P = -\left(\frac{\partial A}{\partial V}\right)_T = \frac{4}{3c}\, \sigma_{\mathrm{SB}} T^4 \tag{8.52}$$
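As a quick consistency check on Eqs. (8.51b) and (8.52), the sketch below evaluates the energy density, entropy density, and pressure of blackbody radiation and confirms that $P = u/3$ and that the Helmholtz free energy per unit volume equals $-P$. The constant values and the chosen temperature are assumptions made for illustration.

```python
import math

SIGMA_SB = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4) (assumed value)
C0 = 2.998e8         # speed of light in vacuum, m/s (assumed value)

def photon_gas(T):
    """Energy density (J/m^3), entropy density (J/(m^3 K)), and pressure (Pa)."""
    u = 4.0 * SIGMA_SB * T**4 / C0
    s = 16.0 * SIGMA_SB * T**3 / (3.0 * C0)   # Eq. (8.51b)
    p = 4.0 * SIGMA_SB * T**4 / (3.0 * C0)    # Eq. (8.52)
    return u, s, p

T = 5800.0  # illustrative temperature, roughly that of the solar surface
u, s, p = photon_gas(T)
helmholtz_density = u - T * s                 # A/V = u - T*s
print(f"u = {u:.3e} J/m^3, s = {s:.3e} J/(m^3 K), P = {p:.3e} Pa")
print("P == u/3:", math.isclose(p, u / 3.0))
print("A/V == -P:", math.isclose(helmholtz_density, -p))
```

Even at this temperature the pressure is well below 1 Pa, which illustrates how small the force exerted by thermal radiation is.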

The force exerted by the radiation pressure, albeit small, has some important applications in trapping and manipulating atomic and molecular particles. The technique is known as optical trapping or optical tweezers; see Lang and Block (Am. J. Phys., 71, 201, 2003) for a bibliographical review. If each photon mode (frequency) is individually considered, the spectral entropy density for unpolarized radiation can be expressed as follows:

$$s_\nu(\nu, T) = \frac{8\pi k_B \nu^2}{c^3} \left[ \frac{x}{e^x - 1} + \ln\left( \frac{e^x}{e^x - 1} \right) \right] \tag{8.53}$$

where $x = h\nu/k_B T$.4 Note that $1/T = (\partial s_\nu/\partial u_\nu)_\nu = (k_B/h\nu) \ln\left(1 + 8\pi h \nu^3/(u_\nu c^3)\right)$, which is consistent with Eq. (8.41). Similar to the energy flux (emissive power) and intensity, the radiation entropy flux can be obtained by multiplying Eq. (8.51b) and Eq. (8.53) by a factor of $c/4$, and the radiation entropy intensity can be obtained by dividing the flux by $\pi$, because of the isotropic nature of blackbody radiation. Clearly, electromagnetic radiation carries both energy and entropy.

Example 8-5. Consider the radiation heat transfer between two parallel plates at $T_1$ and $T_2$, respectively. Assume each plate has an area of $A$ and both plates are blackbodies. The separation distance is much smaller than $\sqrt{A}$ but much greater than the wavelength of thermal radiation. (a) How much entropy is generated at each plate? Evaluate the ratio of entropy generation assuming that $T_1 = 2T_2$. (b) If a thermophotovoltaic receiver is mounted on the lower-temperature side to convert thermal radiative energy to electricity (work), what is its maximum achievable efficiency?

Solution. (a) The net energy flow from plate 1 to plate 2 is $\dot{Q}_{12} = A\sigma_{\mathrm{SB}}(T_1^4 - T_2^4)$. The entropy of plate 1 will decrease at the rate of $dS_1/dt = -\dot{Q}_{12}/T_1$, and the entropy of plate 2 will increase at the rate of $dS_2/dt = \dot{Q}_{12}/T_2$. On the other hand, the net entropy flow from plate 1 to plate 2 can be calculated as $\dot{S}_{12} = \tfrac{4}{3} A\sigma_{\mathrm{SB}}(T_1^3 - T_2^3)$. Therefore, $\dot{S}_{\mathrm{gen},1} = -\dot{Q}_{12}/T_1 + \dot{S}_{12} = A\sigma_{\mathrm{SB}}\left(\tfrac{1}{3}T_1^3 - \tfrac{4}{3}T_2^3 + T_2^4/T_1\right)$, $\dot{S}_{\mathrm{gen},2} = \dot{Q}_{12}/T_2 - \dot{S}_{12} = A\sigma_{\mathrm{SB}}\left(\tfrac{1}{3}T_2^3 - \tfrac{4}{3}T_1^3 + T_1^4/T_2\right)$, and the combined total entropy generation is equal to $\dot{Q}_{12}(1/T_2 - 1/T_1)$, as expected. It can be shown that the entropy generation at each plate is always greater than zero if $T_1 \neq T_2$, and equal to zero if $T_1 = T_2$. When $T_1 = 2T_2$, the entropy generation by plate 1 is about one-quarter and that by plate 2 is about three-quarters of the total entropy generated.

(b) The available energy or exergy of thermal radiation is defined as the maximum work that can be produced by a system with respect to a large reservoir. In the present example, we may assume that the reservoir is at the temperature $T_2$. Suppose an amount of heat is taken from the high-temperature plate; we would like to find out the maximum work that can possibly be produced. Let us consider a reversible heat engine at $T_2$. The radiative energy leaving surface 1 can still be described by $\dot{Q}_1 = A\sigma_{\mathrm{SB}}(T_1^4 - T_2^4)$, and the entropy leaving surface 1 is $\dot{S}_1 = \tfrac{4}{3} A\sigma_{\mathrm{SB}}(T_1^3 - T_2^3)$. Therefore, the entropy generation in plate 1 cannot be eliminated. In other words, it is impossible to achieve the Carnot efficiency of $\eta_{\mathrm{Carnot}} = 1 - T_2/T_1$. The maximum work is obtained when the irreversibility at the lower-temperature plate is negligible and the heat engine is also reversible. It can easily be shown that the maximum work is $\dot{W}_{\max} = \dot{Q}_1 - T_2 \dot{S}_1$, and the optimal efficiency is given by

$$\eta_{\mathrm{opt}} = \frac{\dot{W}_{\max}}{\dot{Q}_1} = 1 - \frac{4(1 + y + y^2)}{3(1 + y)(1 + y^2)} \tag{8.54}$$

where $y = T_1/T_2 > 0$. When $y = 2$, we obtain an optimal efficiency of $\eta_{\mathrm{opt}} = 37.8\%$, which is less than the Carnot efficiency of 50% because of the unrecoverable irreversibility at plate 1. A comprehensive discussion can be found in the review of Landsberg and Tonge.13
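Both parts of Example 8-5 are easy to verify numerically. The sketch below is only an illustration: the plate temperatures (600 K and 300 K) and unit area are assumed values chosen so that $T_1 = 2T_2$, and the function names are hypothetical.

```python
SIGMA_SB = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4) (assumed value)

def plate_entropy_generation(T1, T2, area=1.0):
    """Entropy generation rates (W/K) at two black parallel plates, part (a)."""
    q12 = area * SIGMA_SB * (T1**4 - T2**4)              # net energy flow 1 -> 2
    s12 = 4.0 / 3.0 * area * SIGMA_SB * (T1**3 - T2**3)  # net entropy flow 1 -> 2
    s_gen1 = -q12 / T1 + s12
    s_gen2 = q12 / T2 - s12
    return s_gen1, s_gen2

def optimal_efficiency(T1, T2):
    """Optimal efficiency of Eq. (8.54) with y = T1/T2, part (b)."""
    y = T1 / T2
    return 1.0 - 4.0 * (1.0 + y + y**2) / (3.0 * (1.0 + y) * (1.0 + y**2))

T1, T2 = 600.0, 300.0            # illustrative temperatures with T1 = 2*T2
g1, g2 = plate_entropy_generation(T1, T2)
frac1 = g1 / (g1 + g2)           # share of the total generated at plate 1
eta_opt = optimal_efficiency(T1, T2)
eta_carnot = 1.0 - T2 / T1
print(f"plate 1 share = {frac1:.3f}, plate 2 share = {1.0 - frac1:.3f}")
print(f"eta_opt = {eta_opt:.3f}, eta_Carnot = {eta_carnot:.3f}")
```

For $T_1 = 2T_2$ the shares work out to 11/45 and 34/45, which is the "about one-quarter and three-quarters" split quoted in part (a), and the efficiency is 17/45, or 37.8%.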

The next question is whether temperature can be defined for laser radiation. The answer is yes, and the temperature for high-intensity lasers can be very high. An intuitive guess is to define the temperature, based on the intensity $I_\nu$ of the laser or the monochromatic radiation, by setting $I_\nu = I_{b,\nu}(\nu, T_\nu)$. The definitions of entropy and thermodynamic temperature for optical radiation are very important for analyzing optical energy conversion systems, such as solar cells, thermophotovoltaic generators, luminescence devices, and laser cooling apparatus.13,14 Assume that the monochromatic radiation is from a thermodynamic equilibrium state, such as a resonance cavity that allows only a single mode to exist. The spectral entropy intensity of unpolarized radiation can be written as follows:13

$$L_\nu = \frac{2 k_B \nu^2}{c^2} \left[ \left(1 + \frac{c^2 I_\nu}{2h\nu^3}\right) \ln\left(1 + \frac{c^2 I_\nu}{2h\nu^3}\right) - \frac{c^2 I_\nu}{2h\nu^3} \ln\left(\frac{c^2 I_\nu}{2h\nu^3}\right) \right] \tag{8.55}$$

Thermodynamically, the monochromatic radiation temperature can be defined by

$$\frac{1}{T_\nu(\nu)} = \left(\frac{\partial L_\nu}{\partial I_\nu}\right)_\nu = \frac{k_B}{h\nu} \ln\left(1 + \frac{2h\nu^3}{c^2 I_\nu}\right) \tag{8.56}$$

This is indeed Planck's distribution of intensity at the same temperature. The expressions can be modified for polarized radiation. When the energy intensity is very high, Eq. (8.56) approaches $T_\nu(\nu) \approx c^2 I_\nu/(2 k_B \nu^2)$, which is the Rayleigh-Jeans limit. The radiation temperature is then proportional to the intensity of the monochromatic radiation and can exceed $10^{10}$ K for a 1-mW He-Ne laser at a wavelength of 632.8 nm.15 Therefore, even for lasers with a moderate intensity, $T_\nu$ tends to be so high that the entropy is nearly zero; hence, the interaction of a laser beam with a material can be considered as a work interaction. If a collimated beam is randomly scattered by a rough surface, the scattered radiation will have a much lower intensity because of the increase in the solid angle. The process is accompanied by an entropy increase and is thus irreversible. It is not possible to restore the scattered light to its original intensity without leaving any net effect on the environment of the photon system. On the other hand, if a nearly collimated light is split into two beams with a beamsplitter, the transmitted and reflected beams can interfere with each other to reconstruct the original beam. This process is reversible because the two beams are correlated. The correlated beams have lower entropy than those with the same intensity at thermodynamic equilibrium. The concept of temperature is applicable only if the maximum-entropy state has been reached.15 While the definition of the monochromatic radiation temperature is similar to that of the radiance temperature, the two concepts are quite different. In the definition of radiance temperature, the quality (entropy) does not
enter into play. On the other hand, the definition of the monochromatic temperature for incoherent radiation is for a state that is in equilibrium within certain wavelength and angular ranges. Consider a gray-diffuse body, for which the emissive power is proportional to the blackbody emissive power at any frequency and angle of emission. The monochromatic temperature calculated from Eq. (8.56), however, is frequency dependent. This is because the emitted radiation, as a whole, cannot be considered as blackbody radiation at any temperature. Thermal radiation of this type has been called dilute blackbody radiation.13 This simple example shows that photons at any given frequency can be considered as being in thermodynamic equilibrium, but not necessarily in equilibrium with photons at other frequencies. When radiation has two linear polarizations with different intensities, the monochromatic temperature will be different for the two polarizations. In general, it is a function of frequency, direction, and polarization. The requirement is that each subsystem be in thermodynamic equilibrium, even though it is not in equilibrium with other subsystems at the same spatial location. Photons at different frequencies, with different polarization states, or propagating in different directions can coexist in their own equilibrium states without any interaction with each other. The concept may be called partial equilibrium, as in the case when the two parts of a cylinder are separated by a moveable adiabatic wall. Mechanical equilibrium is established to maintain the same pressure on each side, but the temperatures may differ because thermal equilibrium is reached only inside each portion but not between them. Another example is ultrafast laser heating of metals, as discussed in Chap. 7, where the electron and phonon systems can be treated as being in separate equilibrium states but not in equilibrium with each other.
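The frequency dependence of the monochromatic temperature for gray-body emission can be illustrated with Eq. (8.56). The following is a minimal sketch, assuming a gray emitter with an emissivity of 0.5 at 1000 K and three arbitrarily chosen wavelengths; the constants are rounded values and the function names are illustrative.

```python
import math

H = 6.626e-34    # Planck constant, J*s (assumed rounded value)
KB = 1.381e-23   # Boltzmann constant, J/K (assumed rounded value)
C0 = 2.998e8     # speed of light in vacuum, m/s (assumed rounded value)

def planck_intensity(nu, T):
    """Blackbody intensity per unit frequency for unpolarized radiation."""
    return 2.0 * H * nu**3 / C0**2 / math.expm1(H * nu / (KB * T))

def monochromatic_temperature(nu, I_nu):
    """Monochromatic radiation temperature from Eq. (8.56)."""
    return H * nu / KB / math.log1p(2.0 * H * nu**3 / (C0**2 * I_nu))

T_body, emissivity = 1000.0, 0.5   # illustrative gray-body parameters
temps = {}
for lam_um in (1.0, 3.0, 10.0):
    nu = C0 / (lam_um * 1e-6)
    temps[lam_um] = monochromatic_temperature(
        nu, emissivity * planck_intensity(nu, T_body))
    print(f"{lam_um:5.1f} um: T_nu = {temps[lam_um]:.0f} K")
```

With an emissivity of 1 the function returns the body temperature at every frequency; with an emissivity of 0.5 the result falls below 1000 K by an amount that grows with wavelength, so no single temperature describes the emitted radiation as a whole.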
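The concept of entropy intensity has recently been applied by Caldas and Semiao to study the entropy generation in an absorbing, emitting, and scattering medium, based on the equation of radiative transfer (ERT) introduced in Sec. 2.4.3.16 The key is that the change in entropy in an elemental path length equals the change in intensity divided by the radiation temperature. The entropy change at steady state can be obtained from Eq. (2.53) in Chap. 2 as follows:

$$\frac{dL_\lambda(r, \Omega)}{dr} = \frac{a_\lambda I_{b,\lambda}}{T_\lambda(I_\lambda)} - \frac{(a_\lambda + \sigma_\lambda) I_\lambda}{T_\lambda(I_\lambda)} + \frac{\sigma_\lambda}{4\pi} \int_{4\pi} \frac{I_\lambda(r, \Omega')}{T_\lambda(I_\lambda)}\, \Phi(\Omega', \Omega)\, d\Omega' \tag{8.57}$$

where $\Phi(\Omega', \Omega)$ is the scattering phase function. Like $I_\lambda$, the entropy intensity $L_\lambda$ is a function of wavelength, location, and direction. Note that $I_{b,\lambda} = I_{b,\lambda}(\lambda, T_g)$, where $T_g$ is the local temperature. For an anisotropic radiation field, $T_\lambda(I_\lambda)$ would be different for different directions. For nonblackbody radiation, $T_\lambda(I_\lambda)$ will be a function of wavelength. The term $I_\lambda/T_\lambda(I_\lambda)$, however, is not the same as $L_\lambda$. Integration of Eq. (8.57) over the solid angle of $4\pi$ and over all wavelengths in a volume element yields the entropy that is transferred out of the control volume. Furthermore, the entropy change in the control volume is equal to the total energy absorbed divided by $T_g$. The energy rate received per unit volume can be expressed as

$$\dot{q} = \int_0^\infty \int_{4\pi} a_\lambda (I_\lambda - I_{b,\lambda})\, d\Omega\, d\lambda \tag{8.58}$$

Because the entropy change is the sum of the net entropy transferred into the system and the entropy generation by irreversibility, we can express the volumetric entropy generation rate as

$$\dot{s}_{\mathrm{gen}} = \int_0^\infty \int_{4\pi} a_\lambda I_{b,\lambda} \left[ \frac{1}{T_\lambda(I_\lambda)} - \frac{1}{T_g} \right] d\Omega\, d\lambda + \int_0^\infty \int_{4\pi} \left[ \frac{a_\lambda}{T_g} - \frac{a_\lambda + \sigma_\lambda}{T_\lambda(I_\lambda)} \right] I_\lambda\, d\Omega\, d\lambda + \int_0^\infty \int_{4\pi} \left[ \frac{\sigma_\lambda}{4\pi} \int_{4\pi} \frac{I_\lambda(r, \Omega')}{T_\lambda(I_\lambda)}\, \Phi(\Omega', \Omega)\, d\Omega' \right] d\Omega\, d\lambda \tag{8.59}$$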

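For an isotropic field, $I_\lambda$ is independent of the direction, and scattering does not contribute to the entropy generation. In this case, the entropy generation becomes

$$\dot{s}_{\mathrm{gen}} = \int_0^\infty \int_{4\pi} a_\lambda \left( I_{b,\lambda} - I_\lambda \right) \left[ \frac{1}{T_\lambda(I_\lambda)} - \frac{1}{T_g} \right] d\Omega\, d\lambda \tag{8.60}$$

The entropy generation is always greater than zero, because the intensity is an increasing function of temperature, unless the medium is at thermal equilibrium. When a surface is involved in radiative heat transfer, the entropy generation rate per unit area can be expressed as

$$\dot{s}''_{\mathrm{gen}} = \int_0^\infty \int_0^{2\pi} \int_0^{\pi/2} \left[ \frac{I_{\mathrm{in},\lambda} - I_{\mathrm{out},\lambda}}{T_w} - \left( L_{\mathrm{in},\lambda} - L_{\mathrm{out},\lambda} \right) \right] \cos\theta \sin\theta\, d\theta\, d\phi\, d\lambda \tag{8.61}$$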

where $T_w$ is the wall temperature, and the subscripts "in" and "out" signify the energy or entropy intensity to and from the surface, respectively. If the surface is not a blackbody, the outgoing intensity includes both the emitted and reflected intensities. An alternative approach is to integrate the intensity over the whole sphere with a solid angle of $4\pi$. In Eq. (8.61), the entropy intensity is related to the energy intensity by Eq. (8.55), which is recast in terms of wavelength as follows:

$$L_\lambda(\lambda, I_\lambda) = \frac{2 k_B c}{\lambda^4} \left[ \left(1 + \frac{\lambda^5 I_\lambda}{2hc^2}\right) \ln\left(1 + \frac{\lambda^5 I_\lambda}{2hc^2}\right) - \frac{\lambda^5 I_\lambda}{2hc^2} \ln\left(\frac{\lambda^5 I_\lambda}{2hc^2}\right) \right] \tag{8.62}$$
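The link between Eq. (8.62) and the radiation temperature can be checked numerically: the derivative of $L_\lambda$ with respect to $I_\lambda$ should equal $1/T_\lambda$. The sketch below assumes the wavelength form of Eq. (8.56), $1/T_\lambda = (k_B \lambda/hc) \ln\left(1 + 2hc^2/(\lambda^5 I_\lambda)\right)$, which is obtained here by differentiating Eq. (8.62) and is not written out in the text. The constants, the test wavelength, and the test temperature are assumed values.

```python
import math

H = 6.626e-34    # Planck constant, J*s (assumed rounded value)
KB = 1.381e-23   # Boltzmann constant, J/K (assumed rounded value)
C0 = 2.998e8     # speed of light in vacuum, m/s (assumed rounded value)

def entropy_intensity(lam, I_lam):
    """Spectral entropy intensity of unpolarized radiation, Eq. (8.62)."""
    f = lam**5 * I_lam / (2.0 * H * C0**2)
    return 2.0 * KB * C0 / lam**4 * ((1.0 + f) * math.log1p(f) - f * math.log(f))

def radiation_temperature(lam, I_lam):
    """Monochromatic temperature, wavelength form of Eq. (8.56) (derived here)."""
    return H * C0 / (KB * lam) / math.log1p(2.0 * H * C0**2 / (lam**5 * I_lam))

def planck_intensity(lam, T):
    """Blackbody intensity per unit wavelength."""
    return 2.0 * H * C0**2 / lam**5 / math.expm1(H * C0 / (KB * lam * T))

lam, T = 2.0e-6, 1500.0              # illustrative wavelength and temperature
I0 = planck_intensity(lam, T)
step = 1e-5 * I0                     # small step for a central difference
dL_dI = (entropy_intensity(lam, I0 + step)
         - entropy_intensity(lam, I0 - step)) / (2.0 * step)
print(f"dL/dI = {dL_dI:.6e} 1/K, 1/T = {1.0 / T:.6e} 1/K")
```

The same functions can be applied to a reduced intensity, for example one half of `I0`, to see that the corresponding radiation temperature is lower than the emitter temperature.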

The use of Eq. (8.62) may be disputed when multiple reflections occur. The intensity of the emitted radiation is less than that of the blackbody and is reduced by each reflection. The question remains as to whether the entropy should be calculated from the blackbody intensity, from the actual intensity after each reflection, or from the combined intensity at any given location. An example is a system of two large parallel plates, separated by vacuum. One of the plates is at a temperature $T_1$ and is diffuse-gray with an emissivity of 0.5. The other plate is insulated and is a perfect reflector (i.e., zero emissivity). It is clear that thermal equilibrium will be established in the cavity after a long time. Again, the separation distance is much larger than the thermal radiation wavelengths. The radiation leaving surface 1 includes the emitted rays, as well as the first-order and higher-order reflected rays. An attempt to define the entropy of the emitted ray and of each reflected ray separately will result in a total entropy intensity greater than the entropy intensity calculated from the blackbody intensity $I_{b,\lambda}(\lambda, T_1)$. Therefore, to apply the previous analysis in a consistent way and to obtain meaningful results, we must make the following hypotheses:

• The intensity at any given location is additive regardless of where it originates from, as long as it falls within the same solid angle and wavelength intervals. While this sounds obvious, it is untrue when interference effects become important. The resulting intensity is called the combined intensity.

• The monochromatic radiation temperature $T_\lambda$, defined in Eq. (8.56), is a function of the combined intensity and is in general dependent on the direction and wavelength. The effect of polarization is neglected to simplify the problem. Equation (8.56) must not be applied to each of the reflected or scattered rays. The physical significance is that all the photons with the same wavevector and frequency can be considered as a subsystem that is at thermodynamic equilibrium with the temperature $T_\lambda[I_\lambda(\lambda, \theta, \phi)]$.

• The entropy intensity is defined based on the combined intensity, according to Eq. (8.62). While entropy must be additive, the entropy of all individual rays must be calculated based on the monochromatic temperature of the combined intensity. Because the number of photons, intensity, and entropy are additive, the fraction of the entropy of each ray is the same as the ratio of the intensity of that ray to the combined intensity.

With the theories presented in this section, one should be able to perform a second-law thermodynamic analysis for a given system involving radiative transfer of energy. Zhang and Basu investigated entropy flow and generation considering incoherent multiple reflections.17 There exist different approximations in analyzing the entropy of radiation. For example, the method of dilute blackbody radiation uses a dilution factor and defines an effective temperature for each wavelength.13 When the process is very complicated, it appears that such an effective temperature cannot be easily defined and that this definition cannot be applied to multiple reflections. Entropy generation is usually accompanied by the generation of heat, such as heating by friction, electrical resistance, chemical reaction, or absorption of solar radiation. On the other hand, it appears that entropy generation can occur in radiation without the generation of heat, such as by scattering. The definition of elastic scattering is based on the conservation of energy (wavelength) and momentum, which does not impose any constraints on the reversibility. Further research is much needed in order to better understand the nature of the entropy of radiation and to determine the ultimate efficiency of photovoltaic cells and other radiative processes, including laser cooling and trapping. Another area of possible application of radiation entropy is in nanoscale heat conduction using the EPRT, as discussed in Chap. 7. The entropy concept may be extended to the phonon system by defining the radiation entropy and entropy intensity of phonons.

8.2.4 Limitations of Planck's Law

The concept that a blackbody surface absorbs all radiant energy that is incident upon it comes purely from the geometric-optics point of view, in which light travels in a straight line and cannot interact with an object that does not intercept the light ray. Another example of the geometric-optics viewpoint is that the transmittance of an iris (open aperture) should be 1, i.e., all the radiation incident on the opening will go through. However, for an aperture whose diameter is comparable to the wavelength of the incident radiation, diffraction may become important and, as a result, the transmittance can be less than or even greater than 1. Due to the diffraction effect, a particle that is sufficiently small compared to the wavelength will interact with the radiation field according to the scattering and absorption cross sections, which can be greater than the projected surface area. In some cases, it is possible for the object to absorb more energy than the product of the radiant flux and the projected area. The absorptance can then be greater than 1 and thus exceed the limit set by a blackbody. When such an object is placed in an isothermal enclosure, the emitted energy will be greater than that from a blackbody having the same dimensions. This anomaly has been discussed in detail by Bohren and Huffman.7 The energy density near the surface, within a distance less than the wavelength, can be much greater than that given by Eq. (8.41) and increases as the distance is further reduced. When two objects are placed at a distance much smaller than the characteristic wavelength of thermal radiation, i.e., in the near field, photon tunneling can occur and cause significant enhancement of the energy transfer.
In recent years, there have been numerous studies of light transmission through small apertures, radiation heat transfer at nanometer distances, and light emission from nanostructures.18 This is still an open field with many new developments as well as controversies. We will study these phenomena and the underlying physics in the following two chapters. The entropy concept and the second law limitation have not been applied to the study of near-field energy transfer.

8.3 RADIATIVE PROPERTIES OF SEMI-INFINITE MEDIA

8.3.1 Reflection and Refraction of a Plane Wave

Consider radiation incident from one medium onto another at the interface or boundary. The boundary that separates the media is assumed to be a smooth plane that extends to infinity. Each medium is homogeneous and isotropic, so there is no scattering within the medium. Therefore, the electric response can be characterized by the relative permittivity or dielectric function $\varepsilon$, and the magnetic response can be characterized by the relative permeability $\mu$. For nonmagnetic materials, the refractive index is related to the dielectric function by $n = \sqrt{\varepsilon}$. Keep in mind that these quantities are, in general, complex and frequency dependent. The real and imaginary parts of the refractive index are often called the optical constants. In this section, we present the general formulation for both magnetic and nonmagnetic materials. For certain crystalline and amorphous solids, like quartz and glass, the refractive index is real in a wide spectral region and is the only parameter needed to fully characterize the optical response of the material. In such a case, the expressions can be greatly simplified and the results can be easily comprehended. The reduced results will also be presented because of their importance to numerous engineering problems.

The incident radiation is a monochromatic plane wave with an angular frequency $\omega$. As shown in Fig. 8.7, the wavevector of the incident wave is $\mathbf{k}_1^+ = (k_{1x}, 0, k_{1z})$, and the surface normal defines the plane of incidence, which is the x-z plane. The wavevectors of the reflected and transmitted waves must lie in the same plane. The angle of incidence $\theta_1$ is the angle between the incident wavevector and the z direction, i.e., $\sin\theta_1 = k_{1x}/k_1$ and $\cos\theta_1 = k_{1z}/k_1$, where $k_1^2 = k_{1x}^2 + k_{1z}^2 = \mu_1 \varepsilon_1 \omega^2/c_0^2$. It is common to study the reflection and the refraction for linearly polarized waves, with either the electric or the magnetic field being parallel to the y-axis, because other polarizations can be decomposed into these two polarization components. When the electric field is in the y direction, as shown in Fig. 8.7a, the wave is called a transverse-electric (TE) wave or is said to be perpendicularly (s) polarized. The incident electric field can be expressed as follows, omitting the time-harmonic term $e^{-i\omega t}$ hereafter:

$$\mathbf{E}_i = \hat{\mathbf{y}}\, E_i\, e^{i k_{1z} z + i k_{1x} x} \tag{8.63}$$

FIGURE 8.7 Illustration of reflection and transmission at an interface: (a) TE wave or s polarization. (b) TM wave or p polarization.

The boundary conditions state that the tangential components of both $\mathbf{E}$ and $\mathbf{H}$ must be continuous at the interface. This implies that the x component of the wavevector must be the same for the incident, reflected, and transmitted waves, i.e., $k_{1x} = k_{2x} = k_x$. Because the angle of reflection must be the same as the angle of incidence (specular reflection), we have $\mathbf{k}_1^- = (k_x, 0, -k_{1z})$ for the reflected wave. For the transmitted or refracted wave, we have $\mathbf{k}_2 = (k_x, 0, k_{2z})$ and

$$\sin\theta_2 = \frac{k_x}{k_2} = \frac{n_1 \sin\theta_1}{n_2} \tag{8.64}$$

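which is called Snell's law. It can be easily visualized by observing the bent image of a chopstick in a bowl of water. Note that $k_{2z}^2 = k_2^2 - k_x^2 = \mu_2 \varepsilon_2 \omega^2/c_0^2 - k_x^2 = k_2^2 \cos^2\theta_2$. Generally speaking, the wavevector components and the refractive indices may be complex. Complex angles can be defined so that Eq. (8.64) is always valid. Near the interface, the nonzero components of the electric and magnetic fields are

$$E_y = \begin{cases} \left( E_i e^{i k_{1z} z} + E_r e^{-i k_{1z} z} \right) e^{i k_x x} & \text{for } z < 0 \\ E_t\, e^{i k_{2z} z} e^{i k_x x} & \text{for } z > 0 \end{cases} \tag{8.65}$$

$$H_x = \begin{cases} -\dfrac{k_{1z}}{\omega \mu_0 \mu_1} \left( E_i e^{i k_{1z} z} - E_r e^{-i k_{1z} z} \right) e^{i k_x x} & \text{for } z < 0 \\ -\dfrac{k_{2z}}{\omega \mu_0 \mu_2}\, E_t\, e^{i k_{2z} z} e^{i k_x x} & \text{for } z > 0 \end{cases} \tag{8.66}$$

and

$$H_z = \begin{cases} \dfrac{k_x}{\omega \mu_0 \mu_1} \left( E_i e^{i k_{1z} z} + E_r e^{-i k_{1z} z} \right) e^{i k_x x} & \text{for } z < 0 \\ \dfrac{k_x}{\omega \mu_0 \mu_2}\, E_t\, e^{i k_{2z} z} e^{i k_x x} & \text{for } z > 0 \end{cases} \tag{8.67}$$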

where $E_i$, $E_r$, and $E_t$ are, respectively, the amplitudes of the incident, reflected, and transmitted electric fields at the interface. It is further assumed that $k_x$ is real so that the amplitude of the field is independent of $x$. The Fresnel reflection and transmission coefficients for a TE wave are defined as $r_{12,s} = E_r/E_i$ and $t_{12,s} = E_t/E_i$, respectively. Boundary conditions require that $E_y$ and $H_x$ be continuous at $z = 0$. From Eq. (8.65) and Eq. (8.66), we obtain $1 + r_{12,s} = t_{12,s}$ and $(k_{1z}/\mu_1)(1 - r_{12,s}) = (k_{2z}/\mu_2)\, t_{12,s}$; thus,

$$r_{12,s} = \frac{E_r}{E_i} = \frac{k_{1z}/\mu_1 - k_{2z}/\mu_2}{k_{1z}/\mu_1 + k_{2z}/\mu_2} \tag{8.68a}$$

and

$$t_{12,s} = \frac{E_t}{E_i} = \frac{2 k_{1z}/\mu_1}{k_{1z}/\mu_1 + k_{2z}/\mu_2} \tag{8.68b}$$

which are generally applicable, as long as each medium is homogeneous and isotropic.6 For nonmagnetic materials, the previous equations can be written as follows:

$$r_{12,s} = \frac{n_1 \cos\theta_1 - n_2 \cos\theta_2}{n_1 \cos\theta_1 + n_2 \cos\theta_2} \tag{8.69a}$$

and

$$t_{12,s} = \frac{2 n_1 \cos\theta_1}{n_1 \cos\theta_1 + n_2 \cos\theta_2} \tag{8.69b}$$

The directional-hemispherical spectral reflectivity, or simply reflectivity, $\rho'_\lambda$ is given by the ratio of the reflected energy flux to the incident energy flux, and the directional-spectral absorptivity $\alpha'_\lambda$ is the ratio of the transmitted energy flux to the incident energy flux, since all the photons transmitted through the interface will be absorbed inside the second medium. We use terms ending with "-ivity" only for a perfect interface and those ending with "-tance" for surfaces with roughness and coatings. The energy flux is related to the time-averaged Poynting vector, defined in Eq. (8.18b). From Eq. (8.65) to Eq. (8.67), the x and z components of the Poynting vector at the interface ($z \to 0$) in medium 1 are

$$\langle S_{1x} \rangle = \frac{1}{2} \mathrm{Re}\left[ \frac{k_x^*}{\omega \mu_0 \mu_1^*} (E_i + E_r)(E_i^* + E_r^*) \right] \tag{8.70a}$$

and

$$\langle S_{1z} \rangle = \frac{1}{2} \mathrm{Re}\left[ \frac{k_{1z}^*}{\omega \mu_0 \mu_1^*} (E_i + E_r)(E_i^* - E_r^*) \right] \tag{8.70b}$$

It can be seen that, in general, the reflected wave and the incident wave are coupled, and the energy flow cannot be separated into a reflected flux and an incident flux. Under the assumption that medium 1 is lossless (nonabsorbing or nondissipative) and $k_x^2 < k_1^2$, we can write

$$\langle S_{1z} \rangle = \langle S_{iz} \rangle - \langle S_{rz} \rangle \tag{8.71}$$

where

$$\langle S_{iz} \rangle = \frac{k_{1z}}{2\omega \mu_0 \mu_1} |E_i|^2 \quad \text{and} \quad \langle S_{rz} \rangle = \frac{k_{1z}}{2\omega \mu_0 \mu_1} |E_r|^2 \tag{8.72}$$

If medium 1 is lossy, there will be additional terms associated with $E_i E_r^*$ and $E_i^* E_r$. In this case, the power flow normal to the interface cannot be separated into forward and backward terms because of the cross-coupling terms. Therefore, the lossless condition in medium 1 is required in order to properly define the energy reflectivity; see Salzberg (Am. J. Phys., 16, 444, 1948) and Zhang (J. Heat Transfer, 119, 645, 1997). This is usually not a problem when radiation is incident from air or a dielectric prism onto a medium. The power reflectivity can be defined based on the z components of the reflected and incident Poynting vectors; therefore,

$$\rho'_{\lambda,s}(\theta_1) = \frac{|E_r|^2}{|E_i|^2} = |r_{12,s}|^2 \tag{8.73}$$

The Poynting vector at the interface in medium 2 can be written as

$$\langle \mathbf{S}_t \rangle = \frac{1}{2\omega \mu_0}\, |E_t|^2\, \mathrm{Re}\left( \frac{k_x^* \hat{\mathbf{x}} + k_{2z}^* \hat{\mathbf{z}}}{\mu_2^*} \right) \tag{8.74}$$

which is not parallel to $\mathrm{Re}(\mathbf{k}_2)$ unless $\mathrm{Im}(\mu_2) = 0$. Recall that the plane of constant phase is perpendicular to $\mathrm{Re}(\mathbf{k}_2)$. If medium 2 is dissipative, $\mathrm{Im}(\mathbf{k}_2)$ is parallel to the z-axis and the amplitude will vary along the z direction. The wave becomes inhomogeneous in medium 2, except when $k_x = 0$ (normal incidence). The definition of the transmitted energy flux at the interface is based on the projected Poynting vector in the z direction. Hence, the absorptivity is the ratio of the z components of the transmitted and incident Poynting vectors, viz.,

$$\alpha'_{\lambda,s}(\theta_1) = \frac{\mathrm{Re}(k_{2z}/\mu_2)}{\mathrm{Re}(k_{1z}/\mu_1)}\, |t_{12,s}|^2 \tag{8.75}$$

Note that $\mathrm{Re}(k_{2z}/\mu_2) = \mathrm{Re}(k_{2z}^*/\mu_2^*)$, and $\mathrm{Re}(k_{1z}/\mu_1) = k_{1z}/\mu_1$ since medium 1 is lossless. It can be shown that $\rho'_{\lambda,s} + \alpha'_{\lambda,s} = 1$, as required by energy conservation: $\langle S_{1z} \rangle = \langle S_{2z} \rangle$ at $z = 0$. For nonmagnetic and nearly nondissipative materials, we have

$$\alpha'_{\lambda,s}(\theta_1) = \frac{n_2 \cos\theta_2}{n_1 \cos\theta_1}\, |t_{12,s}|^2 \tag{8.76}$$

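The reflection and transmission coefficients for the transverse-magnetic (TM) wave or parallel (p) polarization are defined as the ratios of the magnetic fields: $r_{12,p} = H_r/H_i$ and $t_{12,p} = H_t/H_i$, respectively. Hence,

$$r_{12,p} = \frac{H_r}{H_i} = \frac{k_{1z}/\varepsilon_1 - k_{2z}/\varepsilon_2}{k_{1z}/\varepsilon_1 + k_{2z}/\varepsilon_2} \tag{8.77a}$$

and

$$t_{12,p} = \frac{H_t}{H_i} = \frac{2 k_{1z}/\varepsilon_1}{k_{1z}/\varepsilon_1 + k_{2z}/\varepsilon_2} \tag{8.77b}$$

In the case of nonmagnetic materials, we obtain

$$r_{12,p} = \frac{n_2 \cos\theta_1 - n_1 \cos\theta_2}{n_2 \cos\theta_1 + n_1 \cos\theta_2} \tag{8.78a}$$

and

$$t_{12,p} = \frac{2 n_2 \cos\theta_1}{n_2 \cos\theta_1 + n_1 \cos\theta_2} \tag{8.78b}$$

At normal incidence, the reflection coefficients in Eq. (8.69a) and Eq. (8.78a) are related by

$$r_{12,s} = \frac{n_1 - n_2}{n_1 + n_2} = -r_{12,p} \tag{8.79}$$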

When both $n_1$ and $n_2$ are real and $n_1 < n_2$, the electric field will experience a phase reversal (phase shift of $\pi$) upon reflection but the magnetic field will not. On the other hand, if $n_1 > n_2$, it is the magnetic field that will experience a phase reversal. In fact, based on Maxwell's equations, the electric and magnetic quantities obey a duality when $\rho_e = 0$, and can be interchanged with the following substitutions: $\mathbf{E} \to \mathbf{H}$ and $\mathbf{H} \to -\mathbf{E}$. Note that $\varepsilon$ and $\mu$, as well as the polarization states s and p, should also be interchanged. The Poynting vector for a TM wave is $\langle \mathbf{S} \rangle = \mathrm{Re}(\mathbf{k}/\varepsilon)\, |H_y|^2/(2\omega\varepsilon_0)$, which is not parallel to $\mathrm{Re}(\mathbf{k})$ when $\mathrm{Im}(\varepsilon_2) \neq 0$. Upon refraction into an absorbing medium, the waves become inhomogeneous and the Poynting vectors for different polarizations may split into different directions; see Halevi and Mendoza-Hernandez (J. Opt. Soc. Am., 71, 1238, 1981). Nevertheless, the constant-amplitude plane is always perpendicular to the z direction because the amplitude cannot change along the x-y plane. The reflectivity for p polarization is

$$\rho'_{\lambda,p}(\theta_1) = |r_{12,p}|^2 \tag{8.80}$$

Hence, the absorptivity becomes

$$\alpha'_{\lambda,p}(\theta_1) = \frac{\mathrm{Re}(k_{2z}/\varepsilon_2)}{\mathrm{Re}(k_{1z}/\varepsilon_1)}\, |t_{12,p}|^2 \tag{8.81}$$

For nonmagnetic and nearly nonabsorbing materials, we have

$$\alpha'_{\lambda,p}(\theta_1) = \frac{n_1 \cos\theta_2}{n_2 \cos\theta_1}\, |t_{12,p}|^2 \tag{8.82}$$

If the incident wave is unpolarized or circularly polarized, the reflectivity can be obtained by averaging the values for p- and s-polarized waves, i.e.,

$$\rho'_\lambda = \frac{\rho'_{\lambda,p} + \rho'_{\lambda,s}}{2} \tag{8.83}$$


The reflectivity for radiation incident from air ($n_1 \approx 1$) to a dielectric medium ($n_2 = 2$) and that from the dielectric to air are shown in Fig. 8.8 for each polarization as well as for the unpolarized incident radiation.

FIGURE 8.8 Reflectivity versus the angle of incidence between air and a dielectric. (a) Incident from air to a medium ($n_1 = 1$, $n_2 = 2$), with $\theta_B = 63.4^\circ$. (b) Incident from a medium to air ($n_1 = 2$, $n_2 = 1$), with $\theta_B = 26.6^\circ$ and total internal reflection beyond $\theta_c = 30^\circ$. Each panel plots the reflectivity $\rho'_\lambda$ of the TE wave, the TM wave, and their average against the angle of incidence $\theta_1$ (deg).

When $n_1 > n_2$, the reflectance will reach 1 at $\theta_1 = \theta_c = \sin^{-1}(n_2/n_1)$. This angle is called the critical angle, and total internal reflection occurs at angles of incidence greater than the critical angle. This is the principle commonly used in optical fibers and waveguides, since light is trapped inside the high-index material and propagates along the medium. It can be seen that in total internal reflection, $k_x > k_2$ while $k_{2z}$ becomes purely imaginary. The amplitude of the wave decays exponentially in the positive $z$ direction. This is similar to Eq. (8.37) and is an evanescent wave, as shown in Fig. 8.3. The time-averaged Poynting vector is zero in the $z$ direction. Hence, no energy is transmitted across the boundary.

For the TE wave, the reflectivity increases monotonically with the angle of incidence and reaches 1 at the grazing angle ($90^\circ$) or at the critical angle when $n_1 > n_2$. The reflectivity for the TM wave, on the other hand, goes through a minimum that is equal to zero. The angle at which $\rho'_{\lambda,p} = 0$ is called the Brewster angle, given by $\theta_B = \tan^{-1}(n_2/n_1)$ for nonmagnetic materials. For p polarization, all the incident energy will be transmitted into medium 2, without reflection at the Brewster angle. This phenomenon has been used to build polarizers and transmission windows in absolute cryogenic radiometers. The physical mechanism of reflection can also be understood as the re-emission by the induced electric dipoles in the medium, based on the Ewald-Oseen extinction theorem. At the Brewster angle, the electric dipoles induced in the material align in the direction of the reflected wave, and the refracted wave is perpendicular to the reflected wave (i.e., $\theta_1 + \theta_2 = 90^\circ$). The reflected power goes to zero because an electric dipole cannot radiate along its own axis. The situation is changed when magnetic materials are involved, such as a negative index material. The fields radiated by both the induced electric dipoles and magnetic dipoles are responsible for the reflection. The Brewster angle can occur for either polarization when the radiated fields cancel each other. A detailed discussion can be found in the publication of Fu et al.19 In an absorbing medium, there is a drop in reflectance for p polarization, but the minimum is not zero. Furthermore, there exists a principal angle at which the phase difference between the two reflection coefficients equals $90^\circ$ and the ratio of the reflectance for the TM and TE waves is minimized.6
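The angular behavior described above can be checked numerically. The following is a minimal sketch, assuming nonmagnetic media and a lossless medium of incidence; the function name `fresnel_reflectivity` is illustrative and does not come from the text. It evaluates Eq. (8.78a) for the TM wave together with the standard TE coefficient, which reduces to Eq. (8.79) at normal incidence, and uses a complex square root so that total internal reflection is handled without a special case.

```python
import cmath
import math


def fresnel_reflectivity(n1, n2, theta1_deg):
    """Return (rho_s, rho_p) for a plane interface between nonmagnetic media.

    n1 is real (lossless medium of incidence); n2 may be real or complex.
    """
    theta1 = math.radians(theta1_deg)
    cos1 = math.cos(theta1)
    # Snell's law: n2*cos(theta2) = sqrt(n2^2 - n1^2 sin^2(theta1)).
    # Beyond the critical angle the argument is negative and the root is
    # purely imaginary, which corresponds to an evanescent wave in medium 2.
    n2cos2 = cmath.sqrt(n2**2 - (n1 * math.sin(theta1))**2)
    # TE (s) coefficient: (n1 cos1 - n2 cos2) / (n1 cos1 + n2 cos2)
    r_s = (n1 * cos1 - n2cos2) / (n1 * cos1 + n2cos2)
    # TM (p) coefficient, Eq. (8.78a), with numerator and denominator
    # multiplied by n2 so that only n2*cos(theta2) is needed.
    r_p = (n2**2 * cos1 - n1 * n2cos2) / (n2**2 * cos1 + n1 * n2cos2)
    return abs(r_s)**2, abs(r_p)**2


if __name__ == "__main__":
    theta_b = math.degrees(math.atan(2.0 / 1.0))   # Brewster angle, air to n2 = 2
    theta_c = math.degrees(math.asin(1.0 / 2.0))   # critical angle, n1 = 2 to air
    print("Brewster angle (deg):", round(theta_b, 1))
    print("Critical angle (deg):", round(theta_c, 1))
    for angle in (0.0, 30.0, 60.0, 85.0):
        rho_s, rho_p = fresnel_reflectivity(1.0, 2.0, angle)
        print(angle, rho_s, rho_p, 0.5 * (rho_s + rho_p))
```

With the indices of Fig. 8.8, the TM reflectivity vanishes at the Brewster angle and both polarizations reach unity beyond the critical angle for incidence from the denser medium.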


The reflectivity for radiation incident from air ($n_1 \approx 1$) or vacuum, at normal incidence, becomes

$$\rho'_{\lambda,n} = \frac{(n_2 - 1)^2 + \kappa_2^2}{(n_2 + 1)^2 + \kappa_2^2} \tag{8.84}$$

for any polarization. It can be seen that the normal reflectivity will be close to 1, when either $n_2 \ll 1$ or $n_2 \gg 1$. The reflectivity is often large for most metals in the infrared because both $n_2$ and $\kappa_2$ are large, whereas the reflectivity of a conventional superconductor approaches 1 when the frequency is lower than that of the superconducting energy gap, since $n_2 \to 0$ in this case. On the other hand, $\rho'_{\lambda,n} \to 0$ when $n_2 \approx 1$ and $\kappa_2 \ll 1$. This can occur in a dielectric material at a wavelength in the infrared and for most metals in the x-ray region.
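As a quick illustration of Eq. (8.84), the sketch below evaluates the normal reflectivity for a few sets of optical constants. The numerical values are illustrative assumptions chosen to mimic the limits discussed above, not data for a specific material, and the function name is hypothetical.

```python
def normal_reflectivity(n2, kappa2):
    """Eq. (8.84): normal reflectivity for incidence from air or vacuum."""
    return ((n2 - 1.0)**2 + kappa2**2) / ((n2 + 1.0)**2 + kappa2**2)


if __name__ == "__main__":
    # Assumed, illustrative optical constants (not measured data).
    print("glass-like dielectric:", normal_reflectivity(1.5, 0.0))
    print("metal-like, large n and kappa:", normal_reflectivity(30.0, 60.0))
    print("n2 -> 0 limit:", normal_reflectivity(0.0, 0.5))
    print("n2 near 1, small kappa:", normal_reflectivity(1.0, 0.01))
```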

8.3.2 Emissivity

Real materials have finite thicknesses. The assumption of semi-infinity or opaqueness requires that the thickness is much greater than the radiation penetration depth. This is usually not a problem for a metal in the visible or infrared spectral regions. When this is not the case, we are dealing with a transparent or semitransparent material, like a glass window. The radiative properties of semitransparent layers and thin films will be studied in the next chapter. Laser beams or light from a spectrophotometer do not extend to infinity and are not perfectly collimated. Nevertheless, as long as the diameter of the beam spot is much greater than the wavelength and the beam divergence is not very large, the directional-spectral reflectivity and absorptivity, calculated from the previous section, are applicable to most situations.

According to Kirchhoff’s law, the directional-spectral emissivity is equal to the directional-spectral absorptivity of a material.1,2 This can be shown by placing the object into an enclosure at thermal equilibrium. When the material is not at thermal equilibrium with the surroundings, the emissivity is defined based solely on the spontaneous emission and is an intrinsic material property that does not depend on the surroundings. On the other hand, the absorptivity is defined based on the net absorbed energy by treating stimulated or induced emission as negative absorption. Under proper definitions, Kirchhoff’s law is always valid in terms of the directional-spectral properties for any given polarization.1 The only assumptions are (a) the material under consideration is at a uniform temperature, at least within several penetration depths near its surface and (b) the external field is not strong enough to alter the material’s intrinsic properties, as in a nonlinear interaction.

We can then compute the directional emissivity for an opaque surface or semi-infinite medium, from the directional-hemispherical reflectivity for incidence from air or vacuum, using the following relation:

$$\varepsilon'_\lambda = 1 - \rho'_\lambda \tag{8.85}$$

The emissivity is commonly calculated by averaging over the two polarizations. The preceding equation can be integrated to obtain the hemispherical emissivity

$$\varepsilon_\lambda = \frac{1}{\pi}\int_0^{2\pi}\!\!\int_0^{\pi/2} \varepsilon'_\lambda \cos\theta \sin\theta \, d\theta \, d\phi \tag{8.86}$$

It can be seen from Fig. 8.8a that, when averaged over the two polarizations, the reflectivity changes little until the Brewster angle and then increases to 1 when the incidence angle approaches $90^\circ$. The hemispherical emissivity for a nonmetallic surface is about 10% smaller than the normal emissivity. On the other hand, the hemispherical emissivity for metallic surfaces is about 20% greater than the normal emissivity. Diffuse emission is a good first-order approximation, even though the surface is smooth. Thus, the hemispherical emissivity may be approximated by the normal emissivity.

In most studies, the emissivity is calculated by the indirect method, based on the reflectivity and Kirchhoff’s law, discussed earlier. Direct calculations can be accomplished by considering the emission from the material, and the internal absorption and transmission. Another method is based on the fluctuation-dissipation theorem, in which the emission arises from the thermally induced fluctuating currents inside the material. Fluctuational electrodynamics is essential to the study of near-field radiation and will be discussed in detail in Chap. 10.

The total-hemispherical emissivity can be evaluated using Planck’s distribution. Therefore,

$$\varepsilon_{\mathrm{tot}} = \frac{\int_0^\infty \varepsilon_\lambda(\lambda)\, e_{b,\lambda}(\lambda,T)\, d\lambda}{\int_0^\infty e_{b,\lambda}(\lambda,T)\, d\lambda} = \frac{\int_0^\infty \varepsilon_\lambda(\lambda)\, e_{b,\lambda}(\lambda,T)\, d\lambda}{\sigma_{\mathrm{SB}} T^4} \tag{8.87}$$

The total emissivity depends on the surface temperature and the spectral dependence of the optical constants. Pure metals usually have a very low emissivity, and the emissivity increases due to surface oxidation. Spectrally selective materials that appear to be reflective to the visible light may exhibit a large total emissivity, greater than 0.9, at room temperature; examples are white paint and paper. An earlier compilation of the radiative properties of many engineering materials can be found in Touloukian and DeWitt.20 The use of surface microstructure to modify the emission characteristics will be discussed in the next chapter.
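The relation between the normal and hemispherical emissivities of a smooth dielectric can be illustrated by evaluating Eq. (8.86) numerically with the emissivity of Eq. (8.85). This is a sketch under stated assumptions: a lossless, nonmagnetic dielectric with the index $n_2 = 2$ of Fig. 8.8, incidence from air, and a simple midpoint rule. The function names are illustrative.

```python
import math


def directional_emissivity(n2, theta):
    """Unpolarized emissivity 1 - rho' (Eq. 8.85), air to a lossless dielectric.

    theta is the polar angle in radians.
    """
    cos1 = math.cos(theta)
    n2cos2 = math.sqrt(n2**2 - math.sin(theta)**2)   # n2*cos(theta2) from Snell's law
    rho_s = ((cos1 - n2cos2) / (cos1 + n2cos2))**2
    rho_p = ((n2**2 * cos1 - n2cos2) / (n2**2 * cos1 + n2cos2))**2
    return 1.0 - 0.5 * (rho_s + rho_p)


def hemispherical_emissivity(emissivity, steps=2000):
    """Midpoint-rule evaluation of Eq. (8.86) for an azimuthally symmetric surface."""
    d_theta = 0.5 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += emissivity(theta) * math.cos(theta) * math.sin(theta) * d_theta
    # The azimuthal integral contributes 2*pi, and the prefactor is 1/pi.
    return 2.0 * total


if __name__ == "__main__":
    eps_normal = directional_emissivity(2.0, 0.0)
    eps_hemi = hemispherical_emissivity(lambda t: directional_emissivity(2.0, t))
    print("normal:", eps_normal, "hemispherical:", eps_hemi,
          "ratio:", eps_hemi / eps_normal)
```

The same integrator could be combined with a spectral weighting by Planck’s distribution to approximate Eq. (8.87), which is not attempted here.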

8.3.3 Bidirectional Reflectance

Real surfaces contain irregularities or surface roughnesses that depend on the processing method. A surface appears to be smooth if the wavelength is much greater than the surface roughness height. A highly polished surface can have a roughness height on the order of nanometers. Some surfaces that appear “rough” to human eyes may appear to be quite “smooth” for far-infrared radiation. The reflection of radiation by rough surfaces is more complicated. For randomly rough surfaces, there often exist a peak around the direction of specular reflection, an off-specular lobe, and a diffuse component. When the surface contains periodic structures, such as patterned surfaces or micromachined surfaces, diffraction effects may become important and several peaks may appear. The bidirectional reflectance distribution function (BRDF), which is a function of the angles of incidence and reflection, fully describes the reflection characteristics from a rough surface at a given wavelength. As illustrated in Fig. 8.9, the BRDF is defined as the reflected radiance (intensity) divided by the incident irradiance (flux) at the surface, i.e.,

$$f_r(\lambda,\theta_i,\phi_i,\theta_r,\phi_r) = \frac{dI_r}{I_i\cos\theta_i\, d\Omega_i} \quad [\mathrm{sr}^{-1}] \tag{8.88}$$

where $(\theta_i,\phi_i)$ and $(\theta_r,\phi_r)$ denote the directions of incident and reflected beams, respectively, $I_i\cos\theta_i\, d\Omega_i$ is the incident irradiance (radiant flux), and $dI_r$ is the reflected radiance (intensity). In the experiment, the detector output signal is proportional to the solid angle $d\Omega_i$. The denominator of Eq. (8.88) gives the incident radiant power reaching the detector. Hence, the BRDF can be obtained from the following measurement equation:

$$f_r = \frac{P_r}{P_i \cos\theta_r\, d\Omega_r} \tag{8.89}$$


FIGURE 8.9 Geometry of the incident and reflected beams in defining the BRDF.

where $P_i$ and $P_r$ are the incident and reflected powers reaching the detector.21 The directional-hemispherical reflectance can be obtained by integrating the BRDF over the hemisphere:1,2

$$\rho'_\lambda = \int_{2\pi} f_r \cos\theta_r \, d\Omega_r \tag{8.90}$$

An important principle of the BRDF is reciprocity, which states the symmetry of the BRDF with regard to reflection and incidence angles. In other words, the reflectance for energy incident from $(\theta_i,\phi_i)$ and reflected to $(\theta_r,\phi_r)$ is equal to that for energy incident from $(\theta_r,\phi_r)$ and reflected to $(\theta_i,\phi_i)$. Therefore,

$$f_r(\lambda,\theta_i,\phi_i,\theta_r,\phi_r) = f_r(\lambda,\theta_r,\phi_r,\theta_i,\phi_i) \tag{8.91}$$

For a diffuse or Lambertian surface, the BRDF is independent of $(\theta_r,\phi_r)$ and is related to the directional-hemispherical reflectance as $f_{r,\mathrm{diff}} = \rho'_\lambda/\pi$. On the other hand, the BRDF for an ideal specular, or mirrorlike, reflector can be represented as

$$f_{r,\mathrm{spec}} = \frac{\rho'_\lambda}{\cos\theta_i}\,\delta_\theta(\theta_r - \theta_i)\,\delta_\phi(\phi_r - \phi_i - \pi) \tag{8.92}$$

where the Dirac delta function $\delta(\xi)$ is zero everywhere, except at $\xi = 0$. Furthermore, the delta functions are normalized such that $\int_{2\pi}\delta_\theta(\theta_r - \theta_i)\,\delta_\phi(\phi_r - \phi_i - \pi)\, d\Omega_r = 1$. These examples clearly demonstrate that the BRDF is applicable to any kind of surface. In the next chapter, we will study the BRDF models based on geometric optics and physical optics, as well as rigorous solutions of the Maxwell equations. We will also discuss the effect of surface microstructures on the BRDF and how to characterize a rough surface.
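The link between the BRDF and the directional-hemispherical reflectance in Eq. (8.90) can be verified for the Lambertian case. The sketch below integrates an arbitrary BRDF over the hemisphere with a midpoint rule; the function name and the sample reflectance of 0.3 are assumptions made only for illustration.

```python
import math


def directional_hemispherical_reflectance(brdf, n_theta=400, n_phi=8):
    """Midpoint-rule evaluation of Eq. (8.90).

    brdf(theta_r, phi_r) returns f_r in 1/sr for a fixed incidence direction.
    """
    d_theta = 0.5 * math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Solid-angle element: dOmega_r = sin(theta) dtheta dphi
            total += (brdf(theta, phi) * math.cos(theta)
                      * math.sin(theta) * d_theta * d_phi)
    return total


if __name__ == "__main__":
    rho = 0.3  # assumed reflectance of a hypothetical diffuse surface
    lambertian = lambda theta, phi: rho / math.pi
    print(directional_hemispherical_reflectance(lambertian))
```

For the Lambertian BRDF $\rho'_\lambda/\pi$ the integral returns the assumed reflectance, consistent with the definition of $f_{r,\mathrm{diff}}$ given above.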


8.4 DIELECTRIC FUNCTION MODELS

Unlike in dilute gases where the molecules are far apart, in solids, the closely packed atoms form band structures. Absorption in solids usually happens in a much broader frequency region or band. Free electrons in metals can interact with the incoming electromagnetic waves or photons, and cause a broadband absorption from the visible (or even ultraviolet) all the way to the microwave and longer wavelengths. For semiconductors, especially those with high impurity (doping) concentrations or at elevated temperatures, both the free electrons and holes contribute to the absorption process. The absorption of a photon causes the electron or the hole to transit to a higher-energy state within the same band. Therefore, free-carrier absorption is caused by intraband transitions. In order to conserve momentum, the carriers must also collide with ionized impurities, phonons, other carriers, grain boundaries, interfaces, and so forth. The collisions act as a damping force on the motion of carriers. The Drude model describes the oscillatory movement of an electron, driven by a harmonic field, which is subjected to a damping force. The model is simple in form and predicts the dielectric function of some metals fairly well in a broad spectral region, especially in the mid- and far-infrared.

Absorption by lattice vibrations or bound electrons, which is important for insulators and lightly doped semiconductors, is due to the existence of electric dipoles formed by the lattice. A maximum absorption is achieved when the frequency equals the vibrational mode of the dipole, i.e., the resonance frequency, which is usually in the mid- to far-infrared region of the spectrum. The contribution of bound electrons is often modeled by the Lorentz model.

Interband transition is the fundamental absorption process in semiconductors. An electron can be excited from the valence band to the conduction band by absorbing a photon, whose energy is greater than the energy gap $E_g$.
Because the absorption by electrons is usually weak in semiconductors, a strong absorption edge is formed near the bandgap. In this transition process, both the energy and the momentum must be conserved.

This section discusses the formulation for different contributions to the dielectric function. It should be noted that the real and imaginary parts of the dielectric function are interrelated according to causality, which is discussed first. Because all naturally occurring and most of the synthesized materials are nonmagnetic at high frequencies, only nonmagnetic materials are considered so that $\mu = 1$ and $n = \sqrt{\varepsilon}$ in the following, except in Sec. 8.4.6.

8.4.1 Kramers-Kronig Dispersion Relations

The real and imaginary parts of an analytic function are related by the Hilbert transform relations. Hendrik Kramers and Ralph Kronig were the first to show that the real and imaginary parts of the dielectric function are interrelated. These relations are called the Kramers-Kronig dispersion relations or K-K relations for simplicity. The K-K relations can be interpreted as the causality in the frequency domain and are very useful in obtaining optical constants from limited measurements. The principle of causality states that the effect cannot precede the cause, or no output before input. Some important relations are given here, and a detailed derivation and proofs can be found in Jackson,5 Born and Wolf,6 and Bohren and Huffman.7 The real part $\varepsilon'$ and the imaginary part $\varepsilon''$ of a dielectric function are related by

$$\varepsilon'(\omega) - 1 = \frac{2}{\pi}\,\mathcal{P}\!\int_0^\infty \frac{\zeta\,\varepsilon''(\zeta)}{\zeta^2 - \omega^2}\, d\zeta \tag{8.93a}$$

and

$$\varepsilon''(\omega) - \frac{\sigma_0}{\varepsilon_0\omega} = -\frac{2\omega}{\pi}\,\mathcal{P}\!\int_0^\infty \frac{\varepsilon'(\zeta) - 1}{\zeta^2 - \omega^2}\, d\zeta \tag{8.93b}$$


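where $\sigma_0$ is the dc conductivity, $\mathcal{P}$ denotes the principal value of the integral, and $\zeta$ is a dummy frequency variable. These relations can be written in terms of $n$ and $\kappa$ as

$$n(\omega) - 1 = \frac{2}{\pi}\,\mathcal{P}\!\int_0^\infty \frac{\zeta\,\kappa(\zeta)}{\zeta^2 - \omega^2}\, d\zeta \tag{8.94a}$$

$$\kappa(\omega) = -\frac{2\omega}{\pi}\,\mathcal{P}\!\int_0^\infty \frac{n(\zeta) - 1}{\zeta^2 - \omega^2}\, d\zeta \tag{8.94b}$$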

Equation (8.93) and Eq. (8.94) are the K-K relations, which relate the real part of a causal function to an integral of its imaginary part over all frequencies, and vice versa. A number of sum rules can be derived based on the K-K relations and are useful in obtaining or validating the dielectric function of a given material. The K-K relations can be applied to reflectance spectroscopy to facilitate the determination of optical constants from the measured reflectivity of a material in vacuum.9 For radiation incident from vacuum on a material at normal incidence, the Fresnel reflection coefficient is

$$r(\omega) = |r(\omega)|\,e^{i\phi(\omega)} = \frac{1 - n(\omega) - i\kappa(\omega)}{1 + n(\omega) + i\kappa(\omega)} \tag{8.95}$$

where $|r|$ is the amplitude and $\phi$ the phase of the reflection coefficient. The directional-hemispherical spectral reflectivity, expressed in terms of $\omega$, is

$$\rho'_\omega(\omega) = rr^* = |r|^2 \tag{8.96}$$

The amplitude and the phase are related, and it can be shown that

$$\phi(\omega) = -\frac{\omega}{\pi}\,\mathcal{P}\!\int_0^\infty \frac{\ln \rho'_\omega(\zeta)}{\zeta^2 - \omega^2}\, d\zeta \tag{8.97}$$

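The refractive index and the extinction coefficient can be calculated, respectively, from

$$n(\omega) = \frac{1 - \rho'_\omega}{1 + \rho'_\omega + 2\cos\phi\,\sqrt{\rho'_\omega}} \tag{8.98}$$

and

$$\kappa(\omega) = \frac{-2\sin\phi\,\sqrt{\rho'_\omega}}{1 + \rho'_\omega + 2\cos\phi\,\sqrt{\rho'_\omega}} \tag{8.99}$$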
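The inversion from reflectivity and phase to optical constants can be checked with a round trip. The following sketch assumes the form $r = (1 - n - i\kappa)/(1 + n + i\kappa)$ for the normal-incidence coefficient in Eq. (8.95) and the corresponding expressions for $n$ and $\kappa$ in Eqs. (8.98) and (8.99); the function names and the sample values of $n$ and $\kappa$ are illustrative assumptions. In practice the phase would come from the K-K integral of Eq. (8.97) applied to a measured spectrum, which is not attempted here.

```python
import cmath
import math


def reflection_coefficient(n, kappa):
    """Eq. (8.95): normal-incidence Fresnel coefficient for incidence from vacuum."""
    n_complex = complex(n, kappa)
    return (1.0 - n_complex) / (1.0 + n_complex)


def optical_constants(rho, phi):
    """Eqs. (8.98) and (8.99): recover (n, kappa) from reflectivity and phase."""
    denom = 1.0 + rho + 2.0 * math.cos(phi) * math.sqrt(rho)
    n = (1.0 - rho) / denom
    kappa = -2.0 * math.sin(phi) * math.sqrt(rho) / denom
    return n, kappa


if __name__ == "__main__":
    n_in, kappa_in = 3.5, 0.2            # assumed optical constants
    r = reflection_coefficient(n_in, kappa_in)
    rho = abs(r)**2                      # Eq. (8.96)
    phi = cmath.phase(r)                 # phase of the reflection coefficient
    print(optical_constants(rho, phi))   # should reproduce n_in and kappa_in
```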

8.4.2 The Drude Model for Free Carriers

The Drude model describes frequency-dependent conductivity of metals and can be extended to free carriers in semiconductors. In the absence of an electromagnetic field, free electrons move randomly. When an electromagnetic field is applied, free electrons acquire a nonzero average velocity, giving rise to an electric current that oscillates at the same frequency as the electromagnetic field. The collisions with the stationary atoms result in a damping force on the free electrons, which is proportional to their velocity. The equation of motion for a single free electron is then

$$m_e\ddot{x} = -m_e\gamma\dot{x} - eE \tag{8.100}$$


where $e$ is the absolute charge of an electron, $m_e$ is the electron mass, and $\gamma$ denotes the strength of the damping due to collision, i.e., the scattering rate or the inverse of the relaxation time $\tau$. Assume the electron motion under a harmonic field $E = E_0e^{-i\omega t}$ is of the form $x = x_0e^{-i\omega t}$ so that $\ddot{x} = -i\omega\dot{x}$. We can rewrite Eq. (8.100) as

$$\dot{x} = \frac{e/m_e}{i\omega - \gamma}\,E$$

The electric current density is $J = -n_e e\dot{x} = \tilde{\sigma}(\omega)E$; therefore, the complex conductivity is

$$\tilde{\sigma}(\omega) = \frac{n_e e^2/m_e}{\gamma - i\omega} = \frac{\sigma_0}{1 - i\omega/\gamma} \tag{8.101}$$

where $\sigma_0 = n_e e^2\tau/m_e$ is the dc conductivity, as discussed in Chap. 5. Equation (8.101) is called the Drude free-electron model, which describes the frequency-dependent complex conductivity of a free-electron system in terms of the dc conductivity and the scattering rate, in a rather simple form. The electrical conductivity approaches the dc conductivity at very low frequencies (or very long wavelengths). The dielectric function is related to the conductivity by Eq. (8.28); thus,

$$\varepsilon(\omega) = \varepsilon_\infty - \frac{\sigma_0\gamma}{\varepsilon_0(\omega^2 + i\gamma\omega)} \tag{8.102}$$

where $\varepsilon_\infty$, which is on the order of 1, is included to account for contributions, other than the contribution of the free electrons, that are significant at high frequencies. There exist several transitions in the ultraviolet and visible regions for metals, such as interband transitions. Note that when $\omega \to \infty$, the real part of the dielectric function of all materials should approach unity, as can be seen from Eq. (8.93a). In the low-frequency limit when $\omega \ll \gamma$, $\tilde{\sigma}(\omega \to 0) \approx \sigma_0$ and $\varepsilon'' \gg \varepsilon'$.
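The Drude expressions are straightforward to evaluate. The sketch below implements Eqs. (8.101) and (8.102) for the $e^{-i\omega t}$ convention and checks them against the relation $\varepsilon = \varepsilon_\infty + i\tilde{\sigma}/(\varepsilon_0\omega)$, which is assumed here to be the content of Eq. (8.28). The dc conductivity and scattering rate are illustrative, order-of-magnitude values for a good metal, not data from the text.

```python
EPS0 = 8.854187817e-12  # vacuum permittivity, F/m


def drude_conductivity(omega, sigma0, gamma):
    """Eq. (8.101): complex conductivity for the exp(-i*omega*t) convention."""
    return sigma0 / (1.0 - 1j * omega / gamma)


def drude_dielectric(omega, sigma0, gamma, eps_inf=1.0):
    """Eq. (8.102): Drude dielectric function."""
    return eps_inf - sigma0 * gamma / (EPS0 * (omega**2 + 1j * gamma * omega))


if __name__ == "__main__":
    sigma0 = 4.0e7   # S/m, assumed dc conductivity
    gamma = 1.0e14   # rad/s, assumed scattering rate
    for omega in (1.0e12, 1.0e14, 1.0e16):
        sigma = drude_conductivity(omega, sigma0, gamma)
        eps = drude_dielectric(omega, sigma0, gamma)
        # Cross-check with eps = eps_inf + i*sigma/(eps0*omega)
        eps_check = 1.0 + 1j * sigma / (EPS0 * omega)
        print(omega, sigma, eps, abs(eps - eps_check) / abs(eps))
```

At the lowest frequency in the loop, the conductivity is close to its dc value and the imaginary part of the dielectric function dominates the real part, as stated for the low-frequency limit.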

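$$\rho_{ss} = \frac{\left|(\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_s + (\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82a}$$

$$\rho_{sp} = \frac{\left|(\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_s - (\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82b}$$

$$\rho_{ps} = \frac{\left|(\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_s - (\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82c}$$

$$\rho_{pp} = \frac{\left|(\mathbf{h}_r\cdot\mathbf{s}_i)(\mathbf{h}_i\cdot\mathbf{s}_r)\,r_s + (\mathbf{v}_r\cdot\mathbf{s}_i)(\mathbf{v}_i\cdot\mathbf{s}_r)\,r_p\right|^2}{|\mathbf{s}_i\times\mathbf{s}_r|^4} \tag{9.82d}$$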

where $r$ denotes Fresnel’s reflection coefficient. The subscripts s and p stand for each polarization. On the left-hand side, the double subscripts indicate the polarization for the incidence and the reflection, respectively. In terms of the microfacet reflectivities, the reflected energies $G_{r,s}$ and $G_{r,p}$ are related to the incident energies $G_{i,s}$ and $G_{i,p}$ by

$$\begin{bmatrix} G_{r,s} \\ G_{r,p} \end{bmatrix} = \begin{bmatrix} \rho_{ss} & \rho_{ps} \\ \rho_{sp} & \rho_{pp} \end{bmatrix} \begin{bmatrix} G_{i,s} \\ G_{i,p} \end{bmatrix} \tag{9.83}$$

The reflectivity is defined as the ratio of the reflected energy $G_r = G_{r,s} + G_{r,p}$ to the incident energy $G_i = G_{i,s} + G_{i,p}$; thus, it depends on the polarization state of the incident wave. To facilitate the calculation, the incident energy of each ray bundle is set to unity such that $(G_{i,s}, G_{i,p}) = (1, 0)$ for s polarization, $(G_{i,s}, G_{i,p}) = (0, 1)$ for p polarization, and $(G_{i,s}, G_{i,p}) = (0.5, 0.5)$ for random polarization (i.e., unpolarized incidence). For the first reflection, $G_{r,s}$ and $G_{r,p}$ are calculated from Eq. (9.83). For multiple reflections, the previously reflected energies are substituted for $G_{i,s}$ and $G_{i,p}$, and the next reflected energy is updated according to Eq. (9.83). Each ray bundle is traced until it leaves the surface, and then, the information of its direction and energy for each polarization is stored in a database. Because the energy of the bundle is reduced after each reflection, there is no need to use random numbers to decide whether a ray bundle is reflected at the microfacet or not.

In a special case, when the planes of incidence and reflection are identical, the polarization state is maintained for either s or p polarization if only the first-order scattering has been considered. This means that the vectors $\mathbf{h}_i$ and $\mathbf{h}_r$ are either parallel or antiparallel (refer to Fig. 9.20); consequently, $\mathbf{h}_i\cdot\mathbf{s}_r = 0$ and $\mathbf{h}_r\cdot\mathbf{s}_i = 0$. It can be seen from Eq. (9.82) that $\rho_{sp} = \rho_{ps} = 0$, $\rho_{ss} = |r_s|^2$, and $\rho_{pp} = |r_p|^2$. The corresponding BRDF is called the in-plane BRDF ($\phi_r = \phi_i$ or $\phi_r = \phi_i + 180^\circ$). Nevertheless, the cross-polarization term is nonzero for the in-plane BRDF when multiple scattering is significant. After a large number of ray bundles have been traced, the BRDF can be calculated in terms of the energy of the ray bundles:

$$f_r(\lambda,\theta_i,\phi_i,\theta_r,\phi_r) = \frac{1}{G_i(\theta_i,\phi_i)}\,\frac{G_r(\theta_r,\phi_r)}{\cos\theta_r\,\Delta\Omega_r} \tag{9.84}$$

where $G_i(\theta_i,\phi_i)$ is the total energy of the incident ray bundles, and $G_r(\theta_r,\phi_r)$ is the energy of the ray bundles leaving the surface within the solid angle $\Delta\Omega_r$, in the direction $(\theta_r,\phi_r)$. The integration of the BRDF yields the directional-hemispherical reflectance. The directional emittance can be obtained according to the conservation of energy and Kirchhoff’s law.
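The bookkeeping of ray-bundle energies can be sketched in a few lines. The example below applies the update of Eq. (9.83) at each microfacet reflection and converts the accumulated energy into a BRDF value with Eq. (9.84). It is a simplified illustration rather than the full Monte Carlo procedure: the microfacet reflectivities are supplied directly as assumed numbers instead of being evaluated from Eq. (9.82), and the function and key names are hypothetical.

```python
import math


def reflect_bundle(energy, rho):
    """Apply Eq. (9.83) once.

    energy is (G_s, G_p); rho maps 'ss', 'sp', 'ps', 'pp' to microfacet
    reflectivities, where the first letter is the incident polarization.
    """
    g_s, g_p = energy
    return (rho["ss"] * g_s + rho["ps"] * g_p,
            rho["sp"] * g_s + rho["pp"] * g_p)


def brdf_estimate(g_r, g_i, theta_r, d_omega_r):
    """Eq. (9.84): BRDF from the energy collected within a solid angle."""
    return g_r / (g_i * math.cos(theta_r) * d_omega_r)


if __name__ == "__main__":
    # Assumed in-plane case: no cross polarization at a single reflection.
    rho = {"ss": 0.4, "pp": 0.2, "sp": 0.0, "ps": 0.0}
    bundle = (0.5, 0.5)                   # unpolarized incidence, unit total energy
    bundle = reflect_bundle(bundle, rho)  # first reflection
    bundle = reflect_bundle(bundle, rho)  # second reflection (multiple scattering)
    print("energy after two reflections:", bundle)
    print("BRDF estimate:", brdf_estimate(sum(bundle), 1.0, math.radians(30.0), 0.01))
```

Because the bundle energy is attenuated deterministically at each reflection, no random number is needed to decide whether a bundle survives, in line with the description above.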

9.5.3 Surface Characterization

In most studies, surface roughness is assumed to satisfy Gaussian statistics in the derivation of the BRDF model and for the surface generation in the Monte Carlo simulation. Furthermore, the roughness statistics of 2-D rough surfaces is assumed to be isotropic in most publications so that the autocorrelation function is independent of the direction. However, the Gaussian distribution may miss important features of natural and man-made rough surfaces that are strongly anisotropic. Before the invention of the AFM, the surface profile was usually measured with a mechanical profiler that scans the surface line-by-line. Some mechanical stylus profilers can measure rough surfaces with a vertical resolution of a few nanometers. However, the lateral resolution is usually on the order of 1 μm due to the large radius of the stylus probe. Because the radius of curvature of the probe tip is in the range from 5 to 50 nm, an AFM can provide detailed information on the topography of a small area on the microrough surfaces, with a vertical resolution of subnanometers and a lateral resolution around 10 nm. The result is stored in an array, containing the height information, $z(m, n)$, where $m = 1, 2, \ldots, M$ and $n = 1, 2, \ldots, N$ are the points along the $x$ and $y$ directions, respectively. To evaluate the 2-D slope distribution $p(\zeta_x, \zeta_y)$, each surface element is determined by the four closest nodes in the data array. The four-node element can be considered as two triangular surfaces with a common side. The surface normals for the two triangles can be averaged to give the mean slope of the surface element such that

$$\zeta_x = \frac{z_{m+1,n} - z_{m,n}}{2l} + \frac{z_{m+1,n+1} - z_{m,n+1}}{2l} \tag{9.85a}$$

$$\zeta_y = \frac{z_{m,n+1} - z_{m,n}}{2l} + \frac{z_{m+1,n+1} - z_{m+1,n}}{2l} \tag{9.85b}$$

where $l$ is the lateral distance between adjacent data points.26 The SDF can be determined by evaluating the slopes of all measured surface elements. For a scan area of 100 × 100 μm², the lateral interval $l \approx 0.2$ μm, when the data are stored in a 512 × 512 array. The 2-D SDFs from the AFM measurement in the tapping mode, for two lightly doped $\langle 100\rangle$ single-crystal silicon surfaces, are shown in Fig. 9.21.27 In the contact mode, lateral or shear forces can distort surface features and reduce the spatial resolution. Thus, deep valleys may not be correctly measured. The AFM scanning performed in the tapping mode with sharper silicon tips allows measuring precipitous slopes. The two SDFs are non-Gaussian and anisotropic, although the anisotropy of Si-1 is not as striking as that of Si-2. The SDF of Si-1 contains only one dominant peak at the center, indicating that a large number of microfacets are only slightly tilted. The SDF of Si-2 also has a dominant peak at the center, though smaller than that of Si-1. Four side peaks can also be seen that are nearly symmetric. These side peaks are associated with the formation of {311} planes, during the chemical etching in the (100) crystalline wafer.26,27 The angle between the (100) plane and any of the four {311} planes is $\cos^{-1}(3/\sqrt{11}) = 25.2^\circ$, which is close to the location of the observed side peaks.

FIGURE 9.21 2-D slope distribution obtained from AFM topographic measurements for two samples: (a) Si-1; (b) Si-2.
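The slope evaluation of Eq. (9.85) and the crystallographic angle quoted above can both be reproduced with a short script. This is a minimal sketch: the height array is a synthetic tilted plane rather than AFM data, indices are zero-based instead of the one-based $(m, n)$ of the text, and the function names are assumptions.

```python
import math


def element_slopes(z, l):
    """Eqs. (9.85a,b): mean slopes (zeta_x, zeta_y) of each four-node element.

    z[m][n] holds heights with m along x and n along y; l is the node spacing.
    """
    slopes = []
    for m in range(len(z) - 1):
        for n in range(len(z[0]) - 1):
            zeta_x = (z[m + 1][n] - z[m][n] + z[m + 1][n + 1] - z[m][n + 1]) / (2.0 * l)
            zeta_y = (z[m][n + 1] - z[m][n] + z[m + 1][n + 1] - z[m + 1][n]) / (2.0 * l)
            slopes.append((zeta_x, zeta_y))
    return slopes


def plane_angle_deg(hkl_a, hkl_b):
    """Angle between two planes of a cubic crystal from their Miller indices."""
    dot = sum(a * b for a, b in zip(hkl_a, hkl_b))
    norm_a = math.sqrt(sum(a * a for a in hkl_a))
    norm_b = math.sqrt(sum(b * b for b in hkl_b))
    return math.degrees(math.acos(dot / (norm_a * norm_b)))


if __name__ == "__main__":
    l = 0.2  # node spacing in micrometers, as for a 100 um scan on a 512 grid
    # Synthetic test surface: a plane z = 0.3 x + 0.1 y sampled on a 4 x 4 grid.
    z = [[0.3 * m * l + 0.1 * n * l for n in range(4)] for m in range(4)]
    print(element_slopes(z, l)[0])                  # approximately (0.3, 0.1)
    print(plane_angle_deg((1, 0, 0), (3, 1, 1)))    # about 25.2 degrees
```

A histogram of the returned slope pairs over all elements would give an estimate of the slope distribution function $p(\zeta_x, \zeta_y)$.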

9.5.4 BRDF Measurements

The BRDF of silicon wafers was measured with a laser scatterometer, named the three-axis automated scatterometer (TAAS), shown schematically in Fig. 9.22.30 The sample is vertically mounted. Three rotary stages, automatically controlled by a computer, are used to change incidence and reflection directions. One rotates the sample around the $y$-axis to change the incidence angle $\theta_i$, another rotates detector A in the $x$-$z$ (horizontal) plane to change the reflection angle $\theta_r$, and the third rotates the arm of detector A out of the $x$-$z$ plane to change the azimuthal angle $\phi_r$ for out-of-plane measurements. Manual rotation of the sample on a sample holder around the $z$-axis adjusts the azimuthal angle $\phi_i$. The incident laser beam is parallel to the optical table ($x$-$z$ plane). A diode laser system serves as an optical source, and a lock-in amplifier, connected with a diode laser controller, modulates the output optical power at 400 Hz. The wavelength can be selected by replacing the fiber-coupled diode laser, and a number of diode lasers in the visible and the near-infrared are available. The diode laser is mounted on a thermoelectrically controlled stage to provide power stability within a standard deviation of 0.2%. An optical fiber is used to provide flexibility for optical access and alignment. The light from the output end of the fiber is in the horizontal plane. As shown in Fig. 9.22, the beam first passes through a collimator with a pair of lenses and a small aperture. A linear polarizer mounted on a dial allows the selection of polarization for light incident on the sample. The beamsplitter then divides the laser beam into two passes: one goes to the sample and the other to a stationary reference detector B. The light scattered by the sample is measured by detector A. The beam spot size on the sample is a few millimeters in diameter, and the measurement can be considered as a spatial average over the beam diameter.
FIGURE 9.22 Schematic of the three-axis automated scatterometer (TAAS) for BRDF measurements.

Si and Ge photodiode detectors measure the radiant power in the wavelength range from 350 to 1100 nm and from 800 to 1800 nm, respectively. The power collected at each detector is sent to a trans-impedance preamplifier that has nine decades of amplification range. The preamplifier has a linear frequency response from dc (zero frequency) up to a certain maximum frequency that is much greater than 400 Hz. The lock-in amplifier picks up only the phase-locked signals at 400 Hz, thereby eliminating the effect of background radiation or stray light without using a chopper. The measurement equation for the BRDF is given by

$$f_r(\theta_i,\phi_i,\theta_r,\phi_r) = C_I\,\frac{V_A}{V_B\cos\theta_r\,\Delta\Omega_r} \tag{9.86}$$

where $V_A$ and $V_B$ are the outputs of detectors A and B, respectively, and $\Delta\Omega_r$ is the reflection solid angle, which is $1.84\times10^{-4}$ sr, as determined by the area of a precision-machined aperture in front of the detector and the distance between this aperture and the beam spot on the sample. An instrument constant $C_I$ compensates for the beamsplitter ratio and the difference in the responsivities of the two detectors. The BRDF within 2.5° of the retroreflection direction ($\theta_r = \theta_i$ and $\phi_r = \phi_i$) cannot be measured since the movable detector blocks the incident beam. A PC performs the data acquisition and automatic rotary-stage control in a LabVIEW environment. In the measurements, $V_A$ and $V_B$ are averaged over many measurements at a given position to reduce the random error. The relative uncertainty of the TAAS is estimated to be 5% for $f_r > 0.1$ through intercomparison with a reference standard instrument at NIST.30
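The measurement equation lends itself to a small worked example. The sketch below evaluates Eq. (9.86) using the solid angle of $1.84\times10^{-4}$ sr quoted above; the detector voltages, the instrument constant, and the aperture dimensions are hypothetical values chosen only to show the arithmetic, and the function names are not from the text.

```python
import math


def aperture_solid_angle(aperture_area, distance):
    """Small-angle estimate of the detector solid angle: area / distance**2."""
    return aperture_area / distance**2


def measured_brdf(v_a, v_b, theta_r_deg, d_omega_r, c_i=1.0):
    """Eq. (9.86): f_r = C_I * V_A / (V_B * cos(theta_r) * dOmega_r)."""
    return c_i * v_a / (v_b * math.cos(math.radians(theta_r_deg)) * d_omega_r)


if __name__ == "__main__":
    d_omega = 1.84e-4  # sr, value quoted in the text
    # Hypothetical detector outputs (volts) and instrument constant.
    print(measured_brdf(v_a=2.0e-5, v_b=1.0, theta_r_deg=30.0,
                        d_omega_r=d_omega, c_i=1.0))
    # Hypothetical aperture: 50 mm^2 area located 0.5 m from the beam spot.
    print(aperture_solid_angle(50.0e-6, 0.5))
```

Since the signal $V_A$ scales with $\cos\theta_r$ for a Lambertian sample, dividing by $\cos\theta_r$ in the measurement equation yields a BRDF that is independent of the reflection angle in that case.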


9.5.5 Comparison of Modeling with Measurements Figure 9.23 compares the predicted BRDFs based on the slope distribution with the BRDFs measured using TAAS at l 635 nm, for Si-2, which is strongly anisotropic.27 For clarity, 0.5

0.5 (a) qi = 0°, f i = 0°

frcosθr (sr−1)

0.3 0.2 0.1 0.0 −90

0.3 0.2 0.1

−60

−30 0 30 60 Observation angle, qobs (deg)

90

0.0

−30

0

30

60

90

60

90

(d) q i = 45°, f i = 45° 0.5 frcosqr (sr−1)

frcosqr (sr−1)

−60

0.6 (c) q i = 45°, f i = 0°

0.4 0.3 0.2 0.1 0.0 −90

−90

Observation angle, qobs (deg)

0.6 0.5

Si-2

0.4

Monte Carlo (MSM)

0.4 frcosqr (sr−1)

(b) q i = 0°, f i = 45°

TAAS data

0.4 0.3 0.2 0.1

−60

−30

0

30

60

Observation angle, qobs (deg)

90

0.0 −90

−60

−30

0

30

Observation angle, qobs (deg)

FIGURE 9.23 Comparison of Monte Carlo model based on the MSM and the measured in-plane BRDF for Si-2. The observation angle uobs is the same as the reflection polar angle when f r f i p and negative refraction polar angle when fr fi.27

only the prediction using the MSM is presented. The predictions with the SGM and the analytical model yield a similar agreement with experiments.26,27 As can be seen from Fig. 9.23a, the prediction and the measurement agree well, except near uobs 0, where the measurements can not be taken within 2.5° and the simulation has a large fluctuation. The simulation captures the general features and trends of the measured BRDF, while some discrepancies exist near the side peaks. For ui 0 and fi 45, as shown in Fig. 9.23b, the BRDF contains two large side peaks associated with the side peaks in the SDF for Si-2 at Zzx Z < Zzy Z < 0.38 in Fig. 9.21b. The Monte Carlo simulations also predict the side peaks located approximately at ur 57, which deviates somewhat from the measured value of 50. Based on Snell’s law, the inclination angle of microfacets is half of ur, at ui 0. Therefore, the measured side peaks in the BRDF correspond to an inclination angle 25, which is very close to the angle of 25.2 between any of the four {311} planes and the (100) plane. On the other hand, the predicted side peaks correspond to an inclination angle of 28.5°, which is almost the same as that calculated from the slope at Zzx Z Zzy Z 0.38. Consequently, the side peak position obtained from the BRDF measurement is more reliable than that predicted

RADIATIVE PROPERTIES OF NANOMATERIALS

371

by the Monte Carlo methods using the topographic data from the AFM measurement. Due to the artifacts in the AFM measurements, the BRDF values are underpredicted when 15 ur 50 and overpredicted when 50 ur 80. When, ui 45 the Monte Carlo method overpredicts the specular peak, presumably due to the limitation of geometric optics. The disagreement between the predicted and measured BRDFs, for 60 uobs 85, may be due to the combined result of the artifacts in the AFM measurement, the limitation of the GOA, and multiple scattering. For ui 45 and fi 45, a small side peak appears at uobs 60 in the measured curve and at uobs 71 in the predicted curve. This is believed to be due to microfacets with {111} orientation that have an inclination angle of 54.7°. The small side peak should occur around uobs 64.4 based on simple geometric arguments. Figure 9.24 shows the directional-spectral emittance measured using an integrating sphere coupled with a monochromator.31 The directional emittance was calculated from the 0.75 (a) Si-1

[Figure 9.24: panels (a) Si-1 and (b) Si-2. Vertical axis: emittance, $\varepsilon'_\lambda$ (0.50 to 0.75). Horizontal axis: wavelength, $\lambda$ (400 to 1000 nm). Curves: anisotropic model, Gaussian model, experiment.]

FIGURE 9.24 Comparison of the predicted and measured emittance of Si-1 and Si-2 at a polar angle approximately equal to 7°.31

measured directional-hemispherical reflectance at an incidence angle of approximately 7°. The emittance values calculated from the models based on the Gaussian distribution and the anisotropic slope distribution are compared with those obtained from experiments. For Si-1, which is nearly isotropic, the difference between the models is small and the agreement with the experiment is excellent. The combined uncertainty in the measurement is estimated to be 0.01, except at $\lambda = 1000$ nm, where the silicon wafer becomes slightly transparent. For Si-2, however, the Gaussian model underpredicts the emittance and there is a large enhancement of the emittance due to anisotropy. The Monte Carlo model, based on the MSM, significantly improves the prediction. Given that the AFM surface topographic measurements may not perfectly match the actual surface slope distribution, an uncertainty of 0.01 has been estimated for the Monte Carlo model. It can be seen that the prediction agrees with the measurement better at short wavelengths, where geometric optics is more suitable. The out-of-plane BRDFs of Si-1 and Si-2, calculated with the MSM at $\theta_i = 30^\circ$, are presented in Fig. 9.25 as contour plots in a polar coordinate system.27 In these plots, the radial and azimuthal coordinates correspond to $\theta_r$ and $\phi_r$, respectively, and the z-axis represents $f_r \cos\theta_r$. The BRDFs depend little on $\phi_r$ around the specular direction, but the dependence becomes large as the angular separation from the specular peak increases. The region where the BRDF is independent of $\phi_r$ is broader for Si-1 than for Si-2. The predicted BRDFs for Si-2 display a strong specular reflection peak, together with the four large side peaks associated with {311} planes. In addition, a small side peak associated with a {111} plane appears at large $\theta_r$, as illustrated in Fig. 9.25c at $\phi_r = 294^\circ$ and another in Fig. 9.25d at $\phi_r = 45^\circ$. The


FIGURE 9.25 BRDF predicted by the MSM at $\theta_i = 30^\circ$ for random polarization.27 (a) Si-1 at $\phi_i = 0^\circ$. (b) Si-1 at $\phi_i = 45^\circ$. (c) Si-2 at $\phi_i = 0^\circ$. (d) Si-2 at $\phi_i = 45^\circ$. In the polar contour plots, the radial coordinate corresponds to $\theta_r$, and the azimuthal coordinate corresponds to $\phi_r$.

actual magnitudes of the small side peaks may be smaller than those predicted by the MSM, and their positions may shift toward smaller $\theta_r$. Nevertheless, Fig. 9.25 indicates that the Monte Carlo method is an effective technique for studying the BRDFs for anisotropic surfaces.
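The geometric arguments used above for the side-peak positions can be checked with a few lines of arithmetic. The following is only a sketch under stated assumptions: the facets are treated as ideal mirrors on a (100) wafer, the plane of incidence is assumed to contain the facet normal, and the function and variable names are illustrative rather than taken from the cited work.

```python
import math

def angle_between_planes(hkl_a, hkl_b):
    """Angle in degrees between two planes of a cubic crystal (Miller indices)."""
    dot = sum(a * b for a, b in zip(hkl_a, hkl_b))
    norm_a = math.sqrt(sum(a * a for a in hkl_a))
    norm_b = math.sqrt(sum(b * b for b in hkl_b))
    return math.degrees(math.acos(dot / (norm_a * norm_b)))

# Inclination of {311} and {111} microfacets relative to the (100) surface.
tilt_311 = angle_between_planes((3, 1, 1), (1, 0, 0))
tilt_111 = angle_between_planes((1, 1, 1), (1, 0, 0))

# Normal incidence: a mirror facet tilted by t reflects light to 2 * t.
peak_311_normal = 2.0 * tilt_311

# Inclination implied by slopes |zeta_x| = |zeta_y| = 0.38 (tan of tilt = slope magnitude).
tilt_from_slope = math.degrees(math.atan(math.hypot(0.38, 0.38)))

# Incidence at 45 deg in the plane containing the facet normal (assumed):
# the reflected direction is at 2 * t - theta_i from the surface normal.
peak_111_oblique = 2.0 * tilt_111 - 45.0

print(f"{{311}} tilt: {tilt_311:.2f} deg, side peak at {peak_311_normal:.1f} deg")
print(f"{{111}} tilt: {tilt_111:.2f} deg, side peak at {peak_111_oblique:.1f} deg")
print(f"tilt from slope 0.38: {tilt_from_slope:.1f} deg")
```

The computed {311} and {111} tilts reproduce the 25.2° and 54.7° quoted in the text, and the oblique-incidence estimate lands close to the 64.4° obtained from the simple geometric argument.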

9.6 SUMMARY

This chapter provided a detailed treatment of the radiative properties of stratified media based on the electromagnetic wave theory, considering partial coherence, and extended the discussion to periodic structures, i.e., photonic crystals. A discussion of the coupled-wave analysis was also presented for periodic gratings. Moreover, a survey was given of some recent research on the bidirectional reflectance of anisotropic surfaces. An important area that was not covered is light scattering by small particles and colloids, for which there has been tremendous interest and extensive study. The heat transfer community is very familiar with light scattering and radiative transfer in scattering and absorbing media. Recently, more and more studies on light scattering have employed rigorous treatments of the electromagnetic wave scattering in random media, considering polarization and surface plasmon resonance. Further discussions on evanescent waves, surface waves including surface plasmon and phonon polaritons, and near-field energy transfer by electromagnetic waves will be given in the next chapter.


REFERENCES

1. R. Siegel and J. R. Howell, Thermal Radiation Heat Transfer, 4th ed., Taylor & Francis, New York, 2002.
2. E. D. Palik (ed.), Handbook of the Optical Constants of Solids, Vols. I, II, and III, Academic Press, San Diego, CA, 1998.
3. O. S. Heavens, Optical Properties of Thin Solid Films, Dover Publications, New York, 1965.
4. Z. Knittl, Optics of Thin Films, Wiley, New York, 1976.
5. M. Q. Brewster, Thermal Radiative Transfer and Properties, Wiley, New York, 1992.
6. Z. M. Zhang, “Reexamination of the transmittance formulae of a lamina,” J. Heat Transfer, 119, 645–647, 1997; Z. M. Zhang, “Optical properties of a slightly absorbing film for oblique incidence,” Appl. Opt., 38, 205–207, 1999.
7. L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, Cambridge, UK, 1995.
8. G. Chen and C. L. Tien, “Partial coherence theory of thin film radiative properties,” J. Heat Transfer, 114, 636–643, 1992.
9. K. Fu, P.-F. Hsu, and Z. M. Zhang, “Unified analytical formulation of thin-film radiative properties including partial coherence,” Appl. Opt., 45, 653–661, 2006.
10. B. J. Lee, V. P. Khuu, and Z. M. Zhang, “Partially coherent spectral radiative properties of dielectric thin films with rough surfaces,” J. Thermophys. Heat Transfer, 19, 360–366, 2005.
11. J. M. Vaughan, The Fabry-Perot Interferometer: History, Theory, Practice and Applications, Adam Hilger, Bristol, PA, 1989.
12. A. R. Kumar, V. A. Boychev, Z. M. Zhang, and D. B. Tanner, “Fabry-Perot resonators built with YBa2Cu3O7−δ films on Si substrates,” J. Heat Transfer, 122, 785–791, 2000.
13. P. Yeh, Optical Waves in Layered Media, Wiley, New York, 1988; P. Yeh, A. Yariv, and C. S. Hong, “Electromagnetic propagation in periodic stratified media. I. General theory,” J. Opt. Soc. Am., 67, 423–438, 1977.
14. C. L. Mitsas and D. I. Siapkas, “Generalized matrix method for analysis of coherent and incoherent reflectance and transmittance of multilayer structures with rough surfaces, interfaces, and finite substrates,” Appl. Opt., 34, 1678–1683, 1995.
15. B. J. Lee and Z. M. Zhang, “Rad-Pro: effective software for modeling radiative properties in rapid thermal processing,” in Proc. 13th IEEE Annu. Int. Conf. Adv. Thermal Processing of Semiconductors (RTP’2005), pp. 275–281, Santa Barbara, CA, October 4–7, 2005.
16. J. D. Joannopoulos, R. D. Meade, and J. N. Winn, Photonic Crystals, Princeton University Press, Princeton, NJ, 1995.
17. K. Sakoda, Optical Properties of Photonic Crystals, Springer-Verlag, Berlin, 2001.
18. J. G. Fleming, S. Y. Lin, I. El-Kady, R. Biswas, and K. M. Ho, “All-metallic three-dimensional photonic crystals with a large infrared bandgap,” Nature, 417, 52–55, 2002; C. H. Seager, M. B. Sinclair, and J. G. Fleming, “Accurate measurements of thermal radiation from a tungsten photonic lattice,” Appl. Phys. Lett., 86, 244105, 2005.
19. H. A. Macleod, Thin Film Optical Filters, 3rd ed., Institute of Physics, Bristol, UK, 2001.
20. D. Maystre (ed.), Selected Papers on Diffraction Gratings, SPIE Milestone Series 83, The International Society for Optical Engineering, Bellingham, WA, 1993.
21. R. Petit (ed.), Electromagnetic Theory of Gratings, Springer, Berlin, 1980.
22. M. G. Moharam, E. B. Grann, D. A. Pommet, and T. K. Gaylord, “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings,” J. Opt. Soc. Am. A, 12, 1068–1076, 1995; M. G. Moharam, D. A. Pommet, E. B. Grann, and T. K. Gaylord, “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: Enhanced transmittance matrix approach,” J. Opt. Soc. Am. A, 12, 1077–1086, 1995.
23. L. F. Li, “Use of Fourier series in the analysis of discontinuous periodic structures,” J. Opt. Soc. Am. A, 13, 1870–1876, 1996.
24. Y. B. Chen, Z. M. Zhang, and P. J. Timans, “Radiative properties of patterned wafers with nanoscale linewidth,” J. Heat Transfer, 129, 79–90, 2007.


25. P. Beckmann and A. Spizzichino, The Scattering of Electromagnetic Waves from Rough Surfaces, Artech House, Norwood, MA, 1987.
26. Q. Z. Zhu and Z. M. Zhang, “Anisotropic slope distribution and bidirectional reflectance of a rough silicon surface,” J. Heat Transfer, 126, 985–993, 2004; Q. Z. Zhu and Z. M. Zhang, “Correlation of angle-resolved light scattering with the microfacet orientation of rough silicon surfaces,” Opt. Eng., 44, 073601, 2005.
27. H. J. Lee, Y. B. Chen, and Z. M. Zhang, “Directional radiative properties of anisotropic rough silicon and gold surfaces,” Int. J. Heat Mass Transfer, 49, 4482–4495, 2006.
28. Y. H. Zhou and Z. M. Zhang, “Radiative properties of semitransparent silicon wafers with rough surfaces,” J. Heat Transfer, 125, 462–470, 2003; H. J. Lee, B. J. Lee, and Z. M. Zhang, “Modeling the radiative properties of semitransparent wafers with rough surfaces and thin-film coatings,” J. Quant. Spectros. Radiat. Transfer, 93, 185–194, 2005.
29. Q. Z. Zhu, H. J. Lee, and Z. M. Zhang, “Validity of hybrid models for the bidirectional reflectance of coated rough surfaces,” J. Thermophys. Heat Transfer, 19, 548–557, 2005.
30. Y. J. Shen, Q. Z. Zhu, and Z. M. Zhang, “A scatterometer for measuring the bidirectional reflectance and transmittance of semiconductor wafers with rough surfaces,” Rev. Sci. Instrum., 74, 4885–4892, 2003.
31. H. J. Lee, A. C. Bryson, and Z. M. Zhang, “Measurement and modeling of the emittance of silicon wafers with anisotropic roughness,” Proc. 16th Symp. Thermophys. Properties, Boulder, CO, July 30–August 4, 2006.

PROBLEMS

9.1. A greenhouse looks like a small glass house used to grow plants in the winter. Based on the transmittance curve of fused silica (SiO2), shown in Fig. 9.2, explain why glass walls can keep the plants warm in the winter. Discuss the greenhouse effect in the atmosphere. What gases are responsible for the greenhouse effect?

9.2. Calculate the transmittance T, the reflectance R, and the absorptance A of a thick (without considering interference) silicon wafer (0.5 mm thick) at normal incidence. Plot T, R, and A versus wavelength in the range from 2.5 to 25 μm. The refractive index and the extinction coefficient of the doped silicon are given in the following table:

Optical Constants of a Doped Silicon Wafer

Wavelength λ (μm)    Refractive index n    Extinction coefficient κ
2.5                  3.44                  0
5.0                  3.43                  1.0 × 10^-7
7.5                  3.42                  8.4 × 10^-5
10.0                 3.42                  2.1 × 10^-4
12.5                 3.42                  4.0 × 10^-4
15.0                 3.42                  5.0 × 10^-4
17.5                 3.42                  9.0 × 10^-4
20.0                 3.42                  1.0 × 10^-3
22.5                 3.42                  1.1 × 10^-3
25.0                 3.42                  1.3 × 10^-3

9.3. Calculate and plot the transmittance and reflectance for the same silicon wafer described in Problem 9.2 at $\lambda = 5$ μm as functions of the polar angle $\theta$. Consider the individual polarizations and their average. Compare your results with those by Zhang et al. (Infrared Phys. Technol., 37, 539, 1996).

9.4. Using data from the table in Problem 9.2, calculate and plot the normal transmittance of a 100-μm-thick silicon wafer near the 10-μm wavelength, considering interference.


(a) Plot the transmittance in terms of wavelength (μm) with data spacings of 0.05 and 0.005 μm, respectively, on one graph.
(b) Plot the transmittance in terms of wavenumber (cm^-1) with data spacings of 5 and 0.5 cm^-1, respectively, on one graph.
(c) What is the fringe-averaged transmittance at the 10-μm wavelength?
(d) What is the free spectral range in wavenumber and in wavelength? How will $\Delta\nu$ and $\Delta\lambda$ change if the wavelength $\lambda$ is changed to 20 μm?

9.5. For gold, the refractive index at $\lambda = 0.5$ μm is $\tilde{n} = 0.916 + i1.84$, and at $\lambda = 2.0$ μm is $\tilde{n} = 0.85 + i12.6$. Calculate the transmittance of a free-standing gold film at these wavelengths for $d = 10$, 20, 50, and 100 nm, using both Eq. (9.10) and Eq. (9.11). Which equation gives the correct results, and why?

9.6. For the three-layer structure shown in Fig. 9.3, calculate the normal reflectance for $n_1 = 1.45$ (glass), $n_2 = 1$ (air gap), and $n_3 = 2$ (substrate) without any absorption at $\lambda = 1$ μm. Plot the reflectance as a function of the air-gap width d. Obtain the analytical formulae of the reflectance maximum and minimum.

9.7. Assume that glass has a refractive index of 1.46 without any absorption in the visible spectrum (0.4 μm < $\lambda$ < 0.7 μm). Design an antireflection coating (for normal incidence) that will minimize the reflectance from a semi-infinite glass. You need to determine the coating thickness and the refractive index (assuming it is independent of wavelength). Plot the normal reflectance of the coated glass surface in the spectral range from 0.4 to 0.7 μm. What material would you recommend for use with the desired property?

9.8. To evaluate the effect of the antireflection coating for oblique incidence, assume the antireflection coating has a refractive index of 1.21 and a thickness of 114 nm. What will be the reflectance at 45° and 60° for each polarization?

9.9. While the extinction coefficient is often related to absorption or loss, it should be noted that when $\kappa \gg n$, it is the real part of the refractive index that is related to the loss. This is because the dielectric function can be expressed as $\varepsilon = \varepsilon' + i\varepsilon'' = (n^2 - \kappa^2) + i2n\kappa$, where $\varepsilon''$ is related to the dissipation. For a semi-infinite medium, a purely negative dielectric function means perfect reflection. The effect of n on the absorption by a thin film can be studied by considering a thin film of thickness d with a complex refractive index $\tilde{n}_2 = n + i\kappa$. For a wavelength of $\lambda = 0.5$ μm and at normal incidence, let $d = 30$ nm and $\kappa = 3.0$. Plot the transmittance, the reflectance, and the emittance (which is the same as the absorptance) against the refractive index n ranging from 0.01 to 2. Discuss the effect of n on the absorption.

9.10. Use the dielectric function of SiC given in Example 8-7 to calculate the normal emittance for a SiC film at wavelengths from 9 to 15 μm, for different film thicknesses: $d = 1$, 10, 100, and 1000 μm. Assume the multiply reflected waves to be perfectly coherent.

9.11. Calculate the emittance as a function of the emission angle for a doped silicon wafer of 200-μm thickness at $\lambda = 20$ μm with $\tilde{n}_2 = 3.42 + i0.001$. Consider p and s polarizations separately, and then take an average. Assume the multiply reflected waves to be perfectly coherent.

9.12. This problem concerns the transmission and reflection of infrared radiation of a YBCO (YBa2Cu3O7) film on a thin MgO substrate of 325-μm thickness, at 300 K and normal incidence. For the YBCO film, use the properties for sample A from Kumar et al. (J. Heat Transfer, 121, 844, 1999). For MgO, use the Lorentz model in Problem 8.26.
(a) Plot the radiation penetration depth of the YBCO film, $\delta_f(\lambda)$, and that of MgO, $\delta_s(\lambda)$, for 1 μm < $\lambda$ < 1000 μm.
(b) Neglecting the interference effect in the MgO substrate, calculate and plot the transmittance T, the film-side reflectance $R_f$, and the back-side reflectance $R_s$, for 1 μm < $\lambda$ < 1000 μm, with different film thicknesses: 0, 30, 48, 70, and 400 nm. Plot T, $R_f$, and $R_s$ in terms of both wavelength (μm) and wavenumber (cm^-1).
(c) Repeat the previous calculation, considering the interference effects in the MgO substrate, for 200 μm < $\lambda$ < 1000 μm (50 to 10 cm^-1). Plot in terms of the wavenumber only. What happens with the interference fringes when the film thickness is 48 nm?

9.13. Calculate the normal transmittance of a 10-μm film with a refractive index $n = 2.4$ without any absorption in the spectral range from 1000 to 3000 cm^-1. One surface of the film is polished, and the other surface has a roughness $\sigma_{rms}$ of 0.10 μm. How does the $\sigma_{rms}$ value affect the transmittance? Compare your result with that shown in Fig. 9.10.


9.14. Reproduce Example 9-2 and Fig. 9.10. Suppose the coherence spectral width $\delta\nu = 1.5\,\Delta\nu$, where $\Delta\nu$ is the free spectral range. Determine the fringe-averaged transmittance. Explain why the peaks and the valleys flip after fringe averaging.

9.15. Calculate and plot the transmittance of a Fabry-Perot resonance cavity, assuming the medium to be lossless with $n_2 = 2$, $d_2 = 100$ μm, and $R = 0.9$, for normal incidence in the wavenumber region from 950 to 1050 cm^-1. What are the free spectral range, the FWHM of the peak, and the Q-factor of the resonator? Does the theoretically predicted FWHM match the plot?

9.16. Group project: A reflectance Fabry-Perot cavity can be constructed by coating a SiO2 film onto a silver substrate first and then a thin silver film onto the SiO2 film. Derive a formula for the reflectance. Based on Kirchhoff’s law, one can calculate the emissivity of the structure. Show that the emissivity exhibits sharp peaks close to unity at specific wavelengths for normal incidence. When the wavelength is fixed, calculate the emissivity versus the polar angle for each polarization. Plot and show that there exist angular lobes in the emissivity of such structures. Hint: Choose the thicknesses of the silver film (on the order of 100 nm) and the SiO2 film (on the order of 3000 nm), and the wavelength (around 1 μm). Use the optical constants from Palik.2

9.17. Group project: Develop a Matlab code for the multilayer radiative properties based on the matrix formulation described in the text for both TE and TM waves. Compare your results with those calculated by using Rad-Pro, downloadable from www.me.gatech.edu/~zzhang.

9.18. Group project: Evaluate and plot the band structures of a Bragg reflector made of quarter-wave high- and low-index materials GaAs, $n = 3.49$, and AlAs, $n = 2.95$, around the wavelength of 1064 nm. Optional: Plot the normal reflectance near the 1064-nm wavelength with 7, 17, and 27 periods, assuming that the substrate is GaAs.

9.19. Derive Eq. (9.60) and Eq. (9.61).

9.20. Based on Eq. (9.64), show that when the evanescent wave exists, it will decay toward negative z. Change the subscript from I to III, and show that when the evanescent wave exists, it will decay toward positive z.

9.21. Derive Eq. (9.71) and Eq. (9.72).

9.22. Use different effective medium formulations to compute the effective dielectric function for silicon with a filling ratio $f = 1/16$ in air at $\lambda = 300$ nm ($n = 5.0$ and $\kappa = 4.2$), $\lambda = 400$ nm ($n = 5.6$ and $\kappa = 0.39$), $\lambda = 500$ nm ($n = 4.3$ and $\kappa = 0.073$), and $\lambda = 800$ nm ($n = 3.7$ and $\kappa = 0.0066$).

9.23. Consider a grating region consisting of Si, with a filling ratio of 1/6, on a semi-infinite Si substrate. The height of the grating is 50 nm. Calculate the reflectance for normal incidence, using different effective medium formulations at the corresponding wavelengths given in Problem 9.22. Compare your results with those in Fig. 9.19.

9.24. Plot the shadowing function for a Gaussian distribution as a function of the polar angle $\theta$ for the rms slopes $w = 0.05$, 0.1, 0.2, and 0.3.

9.25. Calculate the BRDFs at $\lambda = 0.5$ and 2 μm based on the analytical model for a gold surface (opaque) with Gaussian roughness statistics. The SDF is given by

$$p(\zeta_x, \zeta_y) = \frac{1}{2\pi w^2} \exp\left(-\frac{\zeta_x^2 + \zeta_y^2}{2w^2}\right)$$

Use the optical constants from Problem 9.5 and the rms slope $w = 0.1$ and 0.3.

9.26. Comment on the limitations of different analytical models for the BRDF, such as the Rayleigh-Rice perturbation theory, the Kirchhoff approximation, and the geometric optics approximation.

CHAPTER 10

NEAR-FIELD ENERGY TRANSFER

Near-field optics has played a significant role in nanoscience and nanobiotechnology in the past 20 years and continues to be an active research area, especially when dealing with field localization and resonances in micro/nanostructures, with applications in biochemical sensing and nanolithography. The preceding two chapters have laid the foundation of electromagnetic waves in bulk materials and nanostructures. The present chapter offers a more detailed treatment of the energy transfer by electromagnetic waves in the near field, as well as the coupling between near-field phenomena and far-field characteristics. The applications include nanomanufacturing, energy conversion systems, and nanoelectronics thermal management. Ernst Abbe in 1873 and Lord Rayleigh in 1879 studied the required angular separation between two objects for their images to be resolved. The resolution of a conventional microscope is diffraction limited such that the smallest resolvable distance is approximately $0.5\lambda/n$, where $\lambda$ is the wavelength in vacuum and n is the refractive index of the medium. Even with an immersion oil ($n \approx 1.5$), the imaging sharpness is limited to the order of the wavelength. The concept of near-field imaging was first described by Synge (Phil. Mag., 6, 356, 1928). This work elaborated the concept of using a subwavelength aperture as small as 10 nm in diameter to introduce light to a specimen (e.g., a stained biological section), placed within a 10-nm distance, which could move in its plane with a step size less than 10 nm. By measuring the transmitted light with a photoelectric cell and a microscope, an ultramicroscopic image could be constructed. In a subsequent paper (Phil. Mag., 13, 297, 1932), Synge described the idea of using piezoelectricity in microscopy. Synge’s work, however, went largely unnoticed, and the idea of near-field imaging was rediscovered many years later.
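To put the diffraction limit in perspective, the estimate $0.5\lambda/n$ can be evaluated for a representative case. This is a minimal sketch; the 500-nm wavelength is an assumed illustrative value, and the function name is hypothetical.

```python
def diffraction_limited_resolution(wavelength_vacuum, refractive_index=1.0):
    """Smallest resolvable distance, approximately 0.5 * lambda / n."""
    return 0.5 * wavelength_vacuum / refractive_index

wavelength_nm = 500.0   # assumed visible wavelength, for illustration only
aperture_nm = 10.0      # aperture size in Synge's proposal

resolution_air_nm = diffraction_limited_resolution(wavelength_nm)
resolution_oil_nm = diffraction_limited_resolution(wavelength_nm, 1.5)

print(f"air: {resolution_air_nm:.0f} nm, immersion oil: {resolution_oil_nm:.0f} nm")
print(f"oil-immersion limit / 10-nm aperture: {resolution_oil_nm / aperture_nm:.1f}")
```

Even with immersion oil, the far-field limit in this example stays more than an order of magnitude above the 10-nm aperture envisioned by Synge.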
Ash and Nicholls published a paper (Nature, 237, 510, 1972) entitled “Super-resolution aperture scanning microscope.” This work experimentally demonstrated near-field imaging with a resolution of $\lambda/60$ using 10-GHz microwave radiation ($\lambda = 3$ cm). In the 1980s, two groups successfully developed near-field microscopes in the visible region.1,2 The IBM group in Zurich formed the aperture through a quartz tip coated with a metallic film on its sides,1 whereas the Cornell group used silicon microfabrication to form the aperture.2 The fabrication process was later improved by using metal-coated tapered optical fibers. In the early 1990s, Betzig at Bell Labs and collaborators demonstrated single molecule detection and data storage capability of 45 gigabits per square inch.3 Nowadays, the near-field scanning optical microscope (NSOM), also known as the scanning near-field optical microscope (SNOM), has become a powerful tool in the study of fundamental space- and time-dependent processes, thermal metrology, and optical manufacturing with a spatial resolution of less than 50 nm. NSOM is usually combined with the atomic force microscope (AFM) for highly controllable movement and position sensing. An alternative approach is to use a metallic AFM tip to couple the far-field radiation with the near-field electromagnetic waves in a subwavelength region underneath the tip. This is


the so-called apertureless NSOM, which does not require an optical fiber or an aperture. Apertureless tips allow high-intensity laser energy to be focused to nanoscale dimensions for laser-assisted nanothermal manufacturing.4,5 Figure 10.1 illustrates three typical NSOM designs. The first is an aperture-based setup, where a very small opening is formed on an opaque plate and collimated light is incident

[Figure 10.1: panels (a) Aperture, (b) Fiber tip, and (c) Apertureless, each illuminated by a laser, with an opening or tip size $\ll \lambda$.]

FIGURE 10.1 Schematic illustration of different NSOM setups. (a) Aperture on an opaque plate. (b) Aperture at the end of a coated optical fiber. (c) Apertureless metallic tip. The opening or the tip is much smaller than the wavelength $\lambda$. The electric field is highly collimated in the near field within a distance of $\lambda$ and diverges as the distance increases.

from above. The second is based on a tapered optical fiber whose tip serves as an aperture. The third uses an apertureless sharp metallic tip, which reflects (scatters) the incident laser light. All three designs have one thing in common: the light is confined to a narrow region whose width may be much less than a wavelength. Furthermore, the electromagnetic field within one wavelength distance is very intense and highly collimated. In the near-field region, evanescent waves dominate. Because the amplitude of an evanescent wave decays exponentially away from the aperture or tip, the far field, or the radiation field, diverges and becomes very weak. Understanding the nature of evanescent waves and the localized fields is essential for the NSOM and other near-field optical devices. Evanescent waves are also essential in energy transfer between adjacent objects, through photon tunneling, and in surface plasmon polaritons or surface phonon polaritons. Polaritons are elementary excitations in solids due to charge oscillations near the interface and can interact strongly with electromagnetic waves. In this chapter, we will first use total internal reflection to introduce evanescent waves, and then discuss polaritons or electromagnetic surface waves. The application to construct coherent thermal emission sources and radiation heat transfer at nanometer distances will be presented afterward.

10.1 TOTAL INTERNAL REFLECTION, GUIDED WAVES, AND PHOTON TUNNELING

Total internal reflection occurs when light comes from an optically denser material to another material at incidence angles greater than the critical angle determined by Snell’s law. As discussed in Chap. 8, the amplitude of the reflection coefficient becomes unity at incidence angles greater than the critical angle. Although no energy is transferred from medium 1 to medium 2, there exists an electromagnetic field in the second medium near the surface. This electromagnetic field can store as well as exchange energy with medium


1 at any instant of time. The time-averaged energy flux must be zero across the interface. Total internal reflection has important applications in optical fibers and waveguides. When medium 2 is not infinitely extended but a very thin layer sandwiched between the first medium and the third medium (which may be made of the same material as that of medium 1), photons can tunnel through the second medium into the third, even though the angle of incidence is greater than the critical angle. This phenomenon is called photon tunneling, radiation tunneling, or frustrated total internal reflection, and has been studied for over 300 years since Newton’s time. Detailed descriptions of the original experiments and analyses by Isaac Newton can be found in his classic book, Opticks (reprinted by Dover Publications in 1952). The enhanced energy transfer by photon tunneling may have applications in thermophotovoltaic energy conversion devices as well as nanothermal manufacturing using heated AFM cantilever tips.

10.1.1 The Goos-Hänchen Shift

Evanescent waves can be illustrated by using the total internal reflection arrangement. Consider a plane wave of angular frequency $\omega$ incident from a semi-infinite medium 1 to medium 2, as shown in Fig. 10.2a. The wavevectors are $\mathbf{k}_1^+ = k_x\hat{\mathbf{x}} + k_{1z}\hat{\mathbf{z}}$, $\mathbf{k}_1^- = k_x\hat{\mathbf{x}} - k_{1z}\hat{\mathbf{z}}$, and

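[Figure 10.2: (a) incident ($\mathbf{k}_1^+$), reflected ($\mathbf{k}_1^-$), and transmitted ($\mathbf{k}_2$) wavevectors at the interface between medium 1 ($\varepsilon_1$, $\mu_1$) and medium 2 ($\varepsilon_2$, $\mu_2$), with angles $\theta_1$ and $\theta_2$ and axes x and z; (b) $H_y$ versus $k_z z$, with the shift $\alpha$ marked.]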

FIGURE 10.2 Illustration of total internal reflection. (a) Schematic of the incident, reflected, and transmitted waves at the interface between two semi-infinite media. (b) The magnetic field distribution for a TM wave when total internal reflection occurs.

$\mathbf{k}_2 = k_x\hat{\mathbf{x}} + k_{2z}\hat{\mathbf{z}}$, since the parallel wavevector component $k_x$ must be the same as required by the phase-matching boundary condition. The magnitudes of the wavevectors are given by

$$k_1^2 = k_x^2 + k_{1z}^2 = \varepsilon_1\mu_1\omega^2/c^2 \tag{10.1a}$$

and

$$k_2^2 = k_x^2 + k_{2z}^2 = \varepsilon_2\mu_2\omega^2/c^2 \tag{10.1b}$$

where $\varepsilon$ and $\mu$ are the relative (ratio to those of vacuum) permittivity and permeability, respectively, and c is the speed of light in vacuum (throughout this chapter). Let us assume


that the incident wave is p polarized or a TM wave, so that the only nonzero component of the magnetic field is in the y direction. The magnetic field of the incident wave may be expressed as $\mathbf{H}_i = (0, H_y, 0)$, where $H_y(x, y, z, t) = H_i e^{ik_{1z}z + ik_x x - i\omega t}$. For simplicity, let us omit $\exp(-i\omega t)$ from now on. Recall that the Fresnel coefficients for a TM wave are defined as the ratios of the reflected or transmitted magnetic field to the incident magnetic field. For example, the Fresnel reflection coefficient is

$$r_p = \frac{H_r}{H_i} = \frac{k_{1z}/\varepsilon_1 - k_{2z}/\varepsilon_2}{k_{1z}/\varepsilon_1 + k_{2z}/\varepsilon_2} \tag{10.2}$$

The field in medium 1 is composed of the incident and reflected fields, and that in medium 2 is the transmitted field. Therefore,

$$\frac{H_y}{H_i} = \begin{cases} \left(e^{ik_{1z}z} + r_p e^{-ik_{1z}z}\right) e^{ik_x x} & \text{for } z \le 0 \\ (1 + r_p)\, e^{ik_{2z}z} e^{ik_x x} & \text{for } z > 0 \end{cases} \tag{10.3}$$

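The electric fields can be obtained by applying the Maxwell equations. Similar to Sec. 8.3.1, we can write the electric fields in both media as follows:

$$\frac{E_x}{H_i} = \begin{cases} \dfrac{k_{1z}}{\omega\varepsilon_1\varepsilon_0} \left(e^{ik_{1z}z} - r_p e^{-ik_{1z}z}\right) e^{ik_x x} & \text{for } z \le 0 \\ \dfrac{k_{2z}}{\omega\varepsilon_2\varepsilon_0} (1 + r_p)\, e^{ik_{2z}z} e^{ik_x x} & \text{for } z > 0 \end{cases} \tag{10.4}$$

and

$$\frac{E_z}{H_i} = \begin{cases} -\dfrac{k_x}{\omega\varepsilon_1\varepsilon_0} \left(e^{ik_{1z}z} + r_p e^{-ik_{1z}z}\right) e^{ik_x x} & \text{for } z \le 0 \\ -\dfrac{k_x}{\omega\varepsilon_2\varepsilon_0} (1 + r_p)\, e^{ik_{2z}z} e^{ik_x x} & \text{for } z > 0 \end{cases} \tag{10.5}$$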
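As a quick consistency check on the field expressions, the tangential electric field $E_x$ evaluated on both sides of $z = 0$ should agree once $r_p$ is taken from Eq. (10.2). The sketch below assumes nonmagnetic media ($\mu_1 = \mu_2 = 1$), a glass-to-air interface with $\varepsilon_1 = 2.25$ and $\varepsilon_2 = 1$, and fields normalized by $H_i/(\omega\varepsilon_0)$; the function name is illustrative only.

```python
import cmath
import math

def tm_interface(eps1, eps2, theta1_deg, k0=1.0):
    """Return r_p and the normalized E_x on each side of z = 0 for a TM wave.

    Assumes real permittivities and mu1 = mu2 = 1; k0 = omega / c.
    E_x is normalized by H_i / (omega * eps0) and evaluated at x = 0.
    """
    kx = k0 * math.sqrt(eps1) * math.sin(math.radians(theta1_deg))
    k1z = cmath.sqrt(eps1 * k0**2 - kx**2)
    # For a negative real argument, cmath.sqrt returns +i*eta2 (decaying branch).
    k2z = cmath.sqrt(eps2 * k0**2 - kx**2)
    a, b = k1z / eps1, k2z / eps2
    r_p = (a - b) / (a + b)          # Fresnel reflection coefficient for TM waves
    ex_medium1 = a * (1 - r_p)       # E_x just inside medium 1 at z = 0
    ex_medium2 = b * (1 + r_p)       # E_x just inside medium 2 at z = 0
    return r_p, ex_medium1, ex_medium2

for angle in (30.0, 60.0):           # below and above the critical angle
    r_p, ex1, ex2 = tm_interface(2.25, 1.0, angle)
    print(f"theta1 = {angle:.0f} deg: |r_p| = {abs(r_p):.4f}, "
          f"|Ex1 - Ex2| = {abs(ex1 - ex2):.1e}")
```

The magnetic field is continuous by construction, since both sides of Eq. (10.3) reduce to $1 + r_p$ at $z = 0$; the check on $E_x$ is the nontrivial one.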

Assume that the $\varepsilon$’s and $\mu$’s are real and, furthermore, that $\varepsilon_1\mu_1 > \varepsilon_2\mu_2 > 0$. From Eq. (10.1b), we have $k_{2z}^2 = \varepsilon_2\mu_2\omega^2/c^2 - k_x^2$. When $\sqrt{\varepsilon_2\mu_2} < k_x c/\omega < \sqrt{\varepsilon_1\mu_1}$, the incidence angle $\theta_1$ is defined but the refraction angle is not, because $k_{2z}$ becomes imaginary. One can write $k_{2z} = i\eta_2$, where $\eta_2 = \sqrt{k_x^2 - \varepsilon_2\mu_2\omega^2/c^2}$ is a real positive number. In this case, $|r_p| = 1$ and

$$r_p = e^{i\delta} = e^{-i2\alpha} \tag{10.6}$$

where $\tan\alpha = (\eta_2/\varepsilon_2)/(k_{1z}/\varepsilon_1)$. Following Haus,6 the magnetic field at $x = 0$ in medium 1 can be written as

$$H_y = 2H_i e^{-i\alpha} \cos(k_{1z}z + \alpha), \quad z \le 0 \tag{10.7a}$$

Similarly, in medium 2, $H_y$ becomes

$$H_y = 2H_i e^{-i\alpha} \cos(\alpha)\, e^{-\eta_2 z}, \quad z \ge 0 \tag{10.7b}$$

The magnetic field at $x = 0$ is plotted in Fig. 10.2b with respect to $k_z z$, at the instant of time when the phase of $H_i e^{-i\alpha - i\omega t}$ becomes zero. From this figure, one can see that the field decays exponentially in medium 2. As a result of total internal reflection, there is a phase shift in medium 1 so that the maximum field is shifted from the interface to $k_z z = -\alpha$. The phase angle of the reflection coefficient, $\delta = -2\alpha$, is called the Goos-Hänchen phase shift, which depends on the incidence angle $\theta_1$ or $k_x$. The difference in $\delta$ for TE and TM waves in a dielectric prism was used to construct a polarizer called Fresnel’s rhomb, which can change a linearly polarized wave to a circularly polarized wave, or vice versa.7
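The relations among the reflection coefficient, the angle $\alpha$, and the field profile can be verified numerically. The following sketch assumes a glass-to-air interface ($n_1 = 1.5$, $n_2 = 1$, nonmagnetic), a 60° incidence angle, and lengths measured in vacuum wavelengths; these values and all names are illustrative assumptions.

```python
import cmath
import math

n1, n2 = 1.5, 1.0                 # assumed refractive indices (glass to air)
eps1, eps2 = n1**2, n2**2         # nonmagnetic media, mu1 = mu2 = 1
wavelength = 1.0                  # lengths in units of the vacuum wavelength
k0 = 2.0 * math.pi / wavelength
theta1 = math.radians(60.0)       # assumed incidence angle, above critical

critical_angle_deg = math.degrees(math.asin(n2 / n1))

kx = k0 * n1 * math.sin(theta1)
k1z = k0 * n1 * math.cos(theta1)
eta2 = math.sqrt(kx**2 - eps2 * k0**2)        # k2z = i * eta2
alpha = math.atan2(eta2 / eps2, k1z / eps1)   # tan(alpha) = (eta2/eps2)/(k1z/eps1)

k2z = 1j * eta2
r_p = (k1z / eps1 - k2z / eps2) / (k1z / eps1 + k2z / eps2)
delta = cmath.phase(r_p)                      # Goos-Hanchen phase shift

def field_envelope(z):
    """H_y / (2 H_i e^{-i alpha}) at x = 0, from the two field expressions above."""
    if z <= 0.0:
        return math.cos(k1z * z + alpha)
    return math.cos(alpha) * math.exp(-eta2 * z)

decay_length = 1.0 / eta2     # 1/e decay length of the field in medium 2
z_max = -alpha / k1z          # location of the field maximum in medium 1

print(f"critical angle = {critical_angle_deg:.2f} deg")
print(f"|r_p| = {abs(r_p):.6f}, delta = {delta:.4f}, -2*alpha = {-2 * alpha:.4f}")
print(f"decay length = {decay_length:.3f} wavelengths, z_max = {z_max:.3f}")
```

For these assumed values the evanescent field decays to 1/e within a fraction of a wavelength, which is why the field in medium 2 is felt only very close to the interface.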


Example 10-1. Calculate the time-averaged Poynting vector near the interface in the case of total internal reflection.

Solution. Based on Example 8-1, it can be seen that the Poynting vector $\mathbf{S} = \mathrm{Re}(\mathbf{E}) \times \mathrm{Re}(\mathbf{H})$ is in general a function of time. The time-dependent terms that oscillate with $2\omega$, however, become zero after integration. The time-averaged Poynting vector is $\langle\mathbf{S}\rangle = \frac{1}{2}\mathrm{Re}(\mathbf{E} \times \mathbf{H}^*)$. For $z > 0$, $\langle S_z\rangle = \frac{1}{2}\mathrm{Re}(E_x H_y^*) = 0$ because $k_{2z}$ is purely imaginary. It can also be shown that $\langle S_z\rangle = 0$ for $z < 0$ (see Problem 10.2). Furthermore,

$$\langle S_x\rangle = -\frac{1}{2}\mathrm{Re}(E_z H_y^*) = \begin{cases} \dfrac{k_x}{\omega\varepsilon_1\varepsilon_0} |H_i|^2 \left[1 + \cos(2k_{1z}z + 2\alpha)\right], & z < 0 \\ \dfrac{k_x}{\omega\varepsilon_2\varepsilon_0} |H_i|^2 \left[1 + \cos(2\alpha)\right] e^{-2\eta_2 z}, & z > 0 \end{cases} \tag{10.8}$$

Note that $\langle S_x\rangle$ does not have to be continuous at the interface. Depending on whether $\varepsilon$ is positive or negative, the sign of $\langle S_x\rangle$ may be the same as or opposite to $k_x$. It should also be noted that $\langle S_x\rangle$ is a sinusoidal function of z in medium 1 and decays exponentially in medium 2 as z approaches infinity.
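The result of Example 10-1 can be cross-checked by evaluating the time-averaged Poynting vector directly from the complex fields. This is a sketch under the same illustrative assumptions as before (glass to air, $\mu_1 = \mu_2 = 1$, 60° incidence), with $H_i = 1$ and $\omega\varepsilon_0 = 1$ so that the fluxes are in units of $|H_i|^2/(\omega\varepsilon_0)$; the function names are hypothetical.

```python
import cmath
import math

eps1, eps2 = 2.25, 1.0            # assumed relative permittivities (glass, air)
k0 = 2.0 * math.pi                # lengths in units of the vacuum wavelength
theta1 = math.radians(60.0)       # assumed incidence angle, above critical

kx = k0 * math.sqrt(eps1) * math.sin(theta1)
k1z = k0 * math.sqrt(eps1) * math.cos(theta1)
eta2 = math.sqrt(kx**2 - eps2 * k0**2)
k2z = 1j * eta2
r_p = (k1z / eps1 - k2z / eps2) / (k1z / eps1 + k2z / eps2)
alpha = math.atan2(eta2 / eps2, k1z / eps1)

def fields(z):
    """Normalized (E_x, E_z, H_y) at x = 0 for a TM wave with H_i = 1."""
    if z <= 0.0:
        forward, backward = cmath.exp(1j * k1z * z), r_p * cmath.exp(-1j * k1z * z)
        hy = forward + backward
        return k1z / eps1 * (forward - backward), -kx / eps1 * hy, hy
    hy = (1 + r_p) * cmath.exp(1j * k2z * z)
    return k2z / eps2 * hy, -kx / eps2 * hy, hy

def poynting(z):
    """Time-averaged (S_x, S_z) from 0.5 * Re(E x H*)."""
    ex, ez, hy = fields(z)
    s_x = -0.5 * (ez * hy.conjugate()).real
    s_z = 0.5 * (ex * hy.conjugate()).real
    return s_x, s_z

def s_x_closed_form(z):
    """Closed-form parallel flux derived in Example 10-1."""
    if z <= 0.0:
        return kx / eps1 * (1.0 + math.cos(2.0 * k1z * z + 2.0 * alpha))
    return kx / eps2 * (1.0 + math.cos(2.0 * alpha)) * math.exp(-2.0 * eta2 * z)

for z in (-0.7, -0.2, 0.1, 0.5):
    s_x, s_z = poynting(z)
    print(f"z = {z:+.1f}: S_x = {s_x:.6f} (closed form {s_x_closed_form(z):.6f}), "
          f"S_z = {s_z:.1e}")
```

The normal component vanishes on both sides of the interface to within round-off, while the parallel component matches the closed-form expression in both media.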

Newton conjectured that, for the total internal reflection of light by the boundary, the beam of light would penetrate some distance into the optically rarer medium and then reenter the optically denser medium. In addition, he suspected that the path of the beam would be a parabola with its vertex in the rarer medium and, consequently, the actual reflected beam would be shifted laterally with respect to the geometric optics prediction. From the Poynting vector formulation given in Eq. (10.8), the energy must penetrate into the second medium to maintain the energy flow parallel to the interface and reenter the first medium so that no net energy is transferred across the interface. The actual beams have a finite extension so that the reflected beam in the far field can be separated from the incident beam since the Poynting vector is parallel to the wavevector. The effect of the parallel energy flow indeed causes the reflected beam to shift forward from that expected by the geometric optics analysis. Goos and Hänchen were the first to observe the lateral beam shift through a cleverly devised experiment [Ann. Physik, 6(1), 333, 1947; 6(5), 251, 1949]. A schematic of this experiment is shown in Fig. 10.3, in

FIGURE 10.3 Illustration of the Goos-Hänchen experiment.


which a glass plate was used so that the incident light was multiply reflected by the top and bottom surfaces. In the middle of one or both of the surfaces, a silver strip was deposited. This way, the beam reflected by the silver film (solid line) would essentially follow geometric optics and that by total internal reflection would experience a lateral shift. Although the lateral shift is on the order of the wavelength, a large number of reflections (over 100 times) allowed the shift to be observed by a photographic plate. Lotsch published a series of papers on the comprehensive study of the Goos-Hänchen effect.8 Puri and Birman provided an elegant review of earlier works, including several methods for analyzing the Goos-Hänchen effect.9

A quantitative study of the Goos-Hänchen effect is presented next. One way to model the lateral shift is to use a beam of finite width rather than an unbounded plane wave. Another method that is mathematically simpler considers the phase change of an incoming wave packet, which is composed of two plane waves with a slightly different $k_x$. Upon total internal reflection, the phase shift $\delta = 2\alpha$ for a given polarization is a function of $k_x$. The difference in the phase shift will cause the reflected beam to exhibit a lateral shift along the interface ($x$ direction) given as

$$D = \frac{d\delta}{dk_x} = \frac{\varepsilon_1}{\varepsilon_2}\,\frac{2k_x}{\eta_2 k_{1z}}\,\frac{k_{1z}^2 + \eta_2^2}{k_{1z}^2 + \eta_2^2(\varepsilon_1/\varepsilon_2)^2} \tag{10.9}$$

where we have used $\alpha = \tan^{-1}[\eta_2\varepsilon_1/(k_{1z}\varepsilon_2)]$. In formulating Eq. (10.9), $k_x$ is always taken as positive. Equation (10.8) clearly suggests that $\langle S_x \rangle$ and $k_x$ have the same sign when the permittivity is positive and different signs when the permittivity is negative.10 When $\varepsilon_1$ and $\varepsilon_2$ have different signs, the lateral shift $D$ will be negative, which implies that the lateral shift is opposite to $\langle S_x \rangle$ of the incident beam. For a TE wave, one can simply replace $\varepsilon$'s by $\mu$'s in Eq. (10.9). For two dielectrics, we have $\mu_1 = \mu_2 = 1$, $\varepsilon_1 = n_1^2$, and $\varepsilon_2 = n_2^2$, where $n_1$ and $n_2$ are the refractive indices of medium 1 and 2, respectively. Consequently, Eq. (10.9) reduces to the following:

$$D_s = \frac{2\tan\theta_1}{\eta_2} \quad \text{for a TE wave} \tag{10.10a}$$

and

$$D_p = \frac{2\tan\theta_1}{\eta_2\left(n_1^2\sin^2\theta_1/n_2^2 - \cos^2\theta_1\right)} \quad \text{for a TM wave} \tag{10.10b}$$
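As a numerical illustration of Eqs. (10.9) and (10.10), the following Python sketch evaluates the lateral shift for two nonmagnetic dielectrics in both ways and confirms that they agree. The function names and the sample values (a glass-air interface with $n_1 = 1.5$, $n_2 = 1$, and $\theta_1 = 60°$) are illustrative assumptions, not part of the original analysis.

```python
import math


def gh_shift_general(n1, n2, theta1_deg, wavelength, polarization="TE"):
    """Lateral shift D from Eq. (10.9) for two nonmagnetic dielectrics."""
    theta1 = math.radians(theta1_deg)
    k0 = 2.0 * math.pi / wavelength
    kx = k0 * n1 * math.sin(theta1)
    k1z = k0 * n1 * math.cos(theta1)
    eta2_sq = kx**2 - (k0 * n2) ** 2
    if eta2_sq <= 0.0:
        raise ValueError("theta1 must exceed the critical angle")
    eta2 = math.sqrt(eta2_sq)
    # ratio = eps1/eps2 for a TM wave; for a TE wave it becomes mu1/mu2 = 1.
    ratio = 1.0 if polarization == "TE" else (n1 / n2) ** 2
    return (ratio * 2.0 * kx / (eta2 * k1z)
            * (k1z**2 + eta2**2) / (k1z**2 + (ratio * eta2) ** 2))


def gh_shift_dielectric(n1, n2, theta1_deg, wavelength, polarization="TE"):
    """Closed forms of Eq. (10.10a) for TE and Eq. (10.10b) for TM waves."""
    theta1 = math.radians(theta1_deg)
    k0 = 2.0 * math.pi / wavelength
    eta2 = k0 * math.sqrt((n1 * math.sin(theta1)) ** 2 - n2**2)
    d_s = 2.0 * math.tan(theta1) / eta2
    if polarization == "TE":
        return d_s
    return d_s / ((n1 * math.sin(theta1) / n2) ** 2 - math.cos(theta1) ** 2)


if __name__ == "__main__":
    for pol in ("TE", "TM"):
        # Shift in units of the vacuum wavelength (wavelength = 1).
        print(pol, gh_shift_dielectric(1.5, 1.0, 60.0, 1.0, pol))
```

With the wavelength set to 1, the returned shift is expressed in units of the vacuum wavelength, which makes it easy to see that the shift is on the order of the wavelength away from the critical angle.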

At grazing incidence, $k_{1z} \to 0$; however, the shift in the direction parallel to the beam is $D\cos\theta_1 \approx (2/\eta_2)(\varepsilon_2/\varepsilon_1)\sin\theta_1$, which approaches a finite value and does not diverge. At the critical angle, $\theta_1 = \theta_c = \sin^{-1}(n_2/n_1)$, $\eta_2 = \delta = 0$, and $D$ approaches infinity. This difficulty can be removed by using the Gaussian beam incidence.11 Quantum mechanics has also been applied to predict the lateral beam shift.8 The Goos-Hänchen effect also has its analogy in acoustics and is of contemporary interest in dealing with negative index materials, waveguides, and photon tunneling.10,12,13

10.1.2 Waveguides and Optical Fibers

Optical fibers and waveguides are essential for optical communication and optoelectronics. There are numerous other applications such as noncontact radiation thermometry, near-field microscopy, and decorative lighting. According to a report in 2000, the total length of optical fiber wires that had been installed worldwide exceeded $3.0 \times 10^{11}$ m, which is about the distance of a round trip from the earth to the sun. Optical fibers usually operate based on the principle of total internal reflection, as shown in Fig. 10.4. The fiber core is usually surrounded by a cladding material with a lower refractive index.

FIGURE 10.4 Schematic of a planar dielectric waveguide.

The numerical aperture NA is defined according to the half angle $\theta_h$ of the acceptance cone, within which total internal reflection occurs. It can be seen from Fig. 10.4 that

$$\mathrm{NA} = \sin\theta_h = n_1\cos\theta_c = \sqrt{n_1^2 - n_2^2} \tag{10.11}$$

For example, if $n_1 = 1.53$ and $n_2 = 1.46$, the critical angle $\theta_c = 72.6°$, the maximum cone angle $\theta_h = 27°$, and NA $= 0.46$.

There are different types of waveguides, such as graded-index waveguides and metallic waveguides, in addition to the simple dielectric type. The cross section may be circular, annular, rectangular, or elliptical. In some cases, the diameter of the fiber is much greater than the wavelength and the electromagnetic waves inside the fiber are incoherent. These devices are sometimes called lightpipes, which are used for relatively short distances. Optical fibers in communication technology use very thin wires and transmit light with well-defined modes. In the following, the configuration of a 1-D dielectric slab between two media will be discussed to illustrate the basics of an optical waveguide. More detailed treatments can be found from the texts of Haus6 and Kong.7 The present author was fortunate to learn optoelectronics and the electromagnetic wave theory through graduate courses taught by these professors.

Consider the planar structure shown in Fig. 10.4 that is infinitely extended in the $y$ direction. When the variation of $d$ along the $x$ direction is negligibly small compared to the wavelength, the electromagnetic waves inside the waveguide are coherent. A standing wave pattern must be formed in the $z$ direction. This requires the phase change in the $z$ direction, for the round trip including two reflections at the boundary, to be a multiple of $2\pi$, i.e.,

$$2k_{1z}d - 2\delta = 2m\pi, \quad m = 0, 1, 2, \ldots \tag{10.12}$$

where $k_{1z} = (\omega/c)\,n_1\cos\theta_1$, and the phase shift upon total internal reflection is

$$\delta = 2\alpha = 2\tan^{-1}\left(\gamma\,\frac{\sqrt{\sin^2\theta_1 - \sin^2\theta_c}}{\cos\theta_1}\right) \tag{10.13}$$

where $\gamma = 1$ for TE waves and $\gamma = n_1^2/n_2^2$ for TM waves. The solutions of Eq. (10.12) give discrete values of $\theta_1$ or $k_x = (\omega/c)\,n_1\sin\theta_1$, at which waves can propagate through the fiber for a prescribed frequency. These are called guided modes of the optical fiber, and Eq. (10.12) may be regarded as the mode equation. The orders of mode are identified as TE0, TE1, ..., TEm or TM0, TM1, ..., TMm for a 1-D waveguide. For a 2-D waveguide, the subscripts consist of two indices "ml" for each mode. As $\theta_1$ decreases from $\pi/2$ to $\theta_c$, $k_{1z}$ increases and higher-order modes can be excited. One might wonder why $\theta_1 = \pi/2$ or $k_x = k_1$ is not a guided mode. In this case, energy would go through the core, cladding, and air in a straight line. Any bending in the waveguide would result in some loss of energy transfer. On the other hand, the guided modes are much less affected by the bending. This is why an optical fiber can transfer signals over a very long distance while being flexible. To illustrate the solution in terms of $k_{1z}d$, let us rearrange Eq. (10.12) as follows:

$$\tan\left(\frac{k_{1z}d}{2} - \frac{m\pi}{2}\right) = \tan\alpha = \gamma\,\frac{\eta_2}{k_{1z}} = \gamma\sqrt{\frac{(k_1d)^2 - (k_2d)^2}{(k_{1z}d)^2} - 1} \tag{10.14}$$

The left and right sides of Eq. (10.14) can be plotted in the same graph against $k_{1z}d$, as shown in Fig. 10.5, for two values of $\omega d$, assuming $\omega_2d_2 > \omega_1d_1$. The dash-dotted curves are for TE waves, and the dotted curves are for TM waves. The intersections within the circles identify the guided modes. It is noted that fewer modes are permitted with a smaller $\omega d$ or $d/\lambda$. In the graph with $\omega_1d_1$, the possible modes are TE0, TE1, TM0, and TM1 only. A fiber that supports only a single mode for a given frequency is called a single-mode fiber; otherwise, it is called a multimode fiber.

FIGURE 10.5 Solutions of the mode equation, when $\omega_2d_2 > \omega_1d_1$. The circles indicate the intersections between the curves described by the left and right sides of Eq. (10.14).

Example 10-2. Determine the range of $d/\lambda$ so that only the TE0 and TM0 waves are guided in the planar waveguide with $n_1 = 1.55$ and $n_2 = 1.42$. Moreover, if $d/\lambda = 1000$, how many TE and TM modes may be guided?

Solution. Because $d/\lambda$ must be small enough so that the right-hand side of Eq. (10.14) becomes zero at $k_{1z}d < \pi$, we have $(k_1d)^2 - (k_2d)^2 < \pi^2$, or $4\pi^2(n_1^2 - n_2^2)(d/\lambda)^2 < \pi^2$. Finally, we find $d/\lambda < 0.5\,(n_1^2 - n_2^2)^{-1/2} \approx 0.80$. Moreover, from Fig. 10.5, we can estimate the highest-order mode $M$ using $k_{1z}d \approx M\pi$ and $\cos\theta_1 \approx \cos\theta_c$ when $d \gg \lambda$. Hence, with $k_{1z} = (2\pi/\lambda)\,n_1\cos\theta_1$, $2\pi(d/\lambda)\,n_1\cos\theta_c \approx M\pi$, or $M \approx 2(d/\lambda)\,n_1\cos\theta_c = 2(d/\lambda)\sqrt{n_1^2 - n_2^2} \approx 1242.7$. There will be 1243 TE modes and 1243 TM modes including the zeroth-order modes.
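The graphical solution of the mode equation can also be carried out numerically. The sketch below solves Eq. (10.12) by bisection in the form $k_{1z}d - 2\alpha - m\pi = 0$, which avoids the singularities of the tangent in Eq. (10.14). The function names and the dimensionless parameter `v` (equal to $\sqrt{(k_1d)^2 - (k_2d)^2}$) are introduced here for illustration only.

```python
import math


def waveguide_parameter(n1, n2, d_over_lambda):
    """v = sqrt((k1 d)^2 - (k2 d)^2), the largest k1z*d allowed by total internal reflection."""
    return 2.0 * math.pi * d_over_lambda * math.sqrt(n1**2 - n2**2)


def guided_modes(n1, n2, d_over_lambda, polarization="TE"):
    """Return the values of k1z*d that satisfy Eq. (10.12) for m = 0, 1, 2, ..."""
    gamma = 1.0 if polarization == "TE" else (n1 / n2) ** 2
    v = waveguide_parameter(n1, n2, d_over_lambda)

    def residual(u, m):
        # u = k1z*d; alpha follows from tan(alpha) = gamma*eta2/k1z, see Eq. (10.14).
        alpha = math.atan(gamma * math.sqrt(max((v / u) ** 2 - 1.0, 0.0)))
        return u - 2.0 * alpha - m * math.pi

    roots = []
    m = 0
    while m * math.pi < v:
        # The residual increases monotonically from a negative value at the
        # lower bound to v - m*pi > 0 at u = v, so bisection finds one root.
        lo, hi = max(m * math.pi, 1e-12 * v), v
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if residual(mid, m) < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
        m += 1
    return roots


def mode_count(n1, n2, d_over_lambda):
    """Number of guided modes per polarization (one for every m with m*pi < v)."""
    return math.ceil(waveguide_parameter(n1, n2, d_over_lambda) / math.pi)


if __name__ == "__main__":
    print(guided_modes(1.55, 1.42, 1.0, "TE"))
    print(mode_count(1.55, 1.42, 1000.0))
```

Each returned root can be converted to an angle through $\cos\theta_1 = k_{1z}d / (2\pi n_1 d/\lambda)$. The count from `mode_count` is the same for TE and TM waves because the cutoff of each branch depends only on `v`.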

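Next, we will study the fields in a planar waveguide. Let us take a TE wave and write in the more general terms $\varepsilon_1$, $\mu_1$, $\varepsilon_2$, and $\mu_2$. The electric field is nonzero only in the $y$ direction, and the $y$-component of the electric field is given by

$$E_y = \begin{cases} C e^{\eta_2 z} e^{ik_x x}, & z < 0 \\ \left(A e^{ik_{1z}z} + B e^{-ik_{1z}z}\right) e^{ik_x x}, & 0 \le z \le d \\ D e^{-\eta_2 (z-d)} e^{ik_x x}, & z > d \end{cases} \tag{10.15}$$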

where the time-harmonic term $\exp(-i\omega t)$ is again omitted for simplicity. The magnetic fields in the core can be obtained as $H_x = -(i\omega\mu_1\mu_0)^{-1}(\partial E_y/\partial z)$ and $H_z = (i\omega\mu_1\mu_0)^{-1}(\partial E_y/\partial x)$, with $\mu_1$ replaced by $\mu_2$ in the cladding. There are four boundary conditions for the tangential components to be continuous at $z = 0$ and $z = d$. We end up with a set of homogeneous linear equations of the coefficients $A$, $B$, $C$, and $D$. The solution exists only when the determinant of the characteristic $4 \times 4$ matrix becomes zero and can be expressed in a combined equation as follows:

$$\tan(k_{1z}d)\left(\frac{k_{1z}^2}{\mu_1^2} - \frac{\eta_2^2}{\mu_2^2}\right) = 2\,\frac{k_{1z}\eta_2}{\mu_1\mu_2} \tag{10.16}$$

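This is an equivalent expression of the mode equation. An easier way to solve Eq. (10.15) is by considering the condition of total internal reflection at the boundaries, i.e.,

$$A = Be^{-i\delta} \quad \text{and} \quad B = Ae^{i(2k_{1z}d - \delta)} \tag{10.17}$$

The combination gives $e^{i(2k_{1z}d - 2\delta)} = 1$, which is nothing but Eq. (10.12). After substituting $A = Be^{-i2\alpha}$ into Eq. (10.15), boundary conditions require that

$$E_y = \begin{cases} 2e^{-i\alpha}B\cos\left(\dfrac{k_{1z}d}{2} - \dfrac{m\pi}{2}\right) e^{\eta_2 z} e^{ik_x x}, & z < 0 \\[2ex] 2e^{-i\alpha}B\cos\left(k_{1z}z - \dfrac{k_{1z}d}{2} + \dfrac{m\pi}{2}\right) e^{ik_x x}, & 0 \le z \le d \\[2ex] 2e^{-i\alpha}B\cos\left(\dfrac{k_{1z}d}{2} + \dfrac{m\pi}{2}\right) e^{-\eta_2 (z-d)} e^{ik_x x}, & z > d \end{cases} \tag{10.18}$$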

Figure 10.6a shows the electric field distribution for TE0, TE1, and TE2. The decaying fields inside the cladding are clearly demonstrated. For a cladding with the conductivity $\sigma \to \infty$, the waves will be perfectly reflected at the interface without any phase shift and the electric field must vanish in the cladding. Only the odd $m$'s are guided modes. The first guided mode is TE1, and the guided mode TEq corresponds to $q = (m + 1)/2$, with $m = 1, 3, 5, \ldots$ The electric fields for the conducting waveguide modes TE1, TE2, and TE3 are shown in Fig. 10.6b for comparison with those for the first three modes in the dielectric waveguide. The difference lies in that no fields can penetrate into the conducting waveguide, whereas the fields can penetrate into the dielectric cladding.

FIGURE 10.6 Electric field distribution $E_y(z)$ in planar waveguides. (a) Dielectric cladding (TE0, TE1, and TE2). (b) Conductive cladding (TE1, TE2, and TE3). For the conducting cladding, $\sigma \to \infty$ and the lowest-order TE mode is the first order.

Example 10-3. Determine the energy flux, phase velocity, and group velocity of the electromagnetic waves in a planar dielectric waveguide.

Solution. Obviously, there is no net energy flow in the $z$ direction, and $\langle S_x \rangle = \tfrac{1}{2}\mathrm{Re}(E_y H_z^* - E_z H_y^*)$. The second term on the right becomes zero for a TE wave; thus, $\langle S_x \rangle = (k_x/2\omega\mu_0)\,E_y E_y^*$. Integration of $\langle S_x \rangle$ from $z = -\infty$ to $\infty$ gives the power transmitted per unit length in the $y$ direction. Note that a small portion of energy is transmitted through the cladding. The phase velocity along the $x$ direction is $v_p = \omega/k_x = c/(n_1\sin\theta_1)$. The group velocity for a given mode is given by $v_g = (dk_x/d\omega)^{-1}$, which requires the solution of Eq. (10.16) accounting for the frequency-dependent refractive index.

In Chap. 9, we introduced the concept of Fabry-Perot resonant cavities. Two- and three-dimensional optical cavities and microwave cavities support resonance modes, which are standing waves within the cavity. These devices are important for photonics and optoelectronics. Microcavities have also been used to modify the surface radiative properties. The quality factor, or the Q-factor, of a resonator is defined as the ratio of energy storage to the energy dissipation. High Q-factors can be achieved with the microfabricated microcavities for quantum electrodynamics (QED), enhancement and suppression of spontaneous emission, and biological and chemical sensing.14

A special microcavity is made of spheres or disks, where the resonance is built up around a circumference in the form of a polygon. Total internal reflection traps the light inside the microsphere or the disk. At a particular wavelength, when resonance occurs, light undergoes multiple reflections, and a strong electric field that is confined near the perimeter can be built. This is the so-called whispering gallery mode (WGM), named after the whispering gallery at St. Paul's Cathedral in London. A whispering gallery is a circular gallery under a dome where whispers can be heard from the opposite side of the building. Optical fibers or waveguides are commonly used to couple the photon energy to or from the microcavities via evanescent waves. Ultrahigh Q-factors can be achieved with WGMs. The energy coupling mechanisms have recently been studied by Guo and Quan using a finite-element method.15

A recent development in fiber optics is the use of photonic crystals (PCs) to confine the light into a fiber, whose cladding region is made of PCs, rather than a solid low-index material. The fiber core may be either solid or hollow, and the PCs in the cladding region may contain air-filled holes in silica. For this reason, these fibers are called photonic crystal fibers (PCFs), and some are called holey fibers.16 In the stop band, waves cannot propagate inside the PC; the PC thus effectively confines the propagating wave to the core region, where the modes can be guided without using total internal reflection. One of the advantages of PCFs over conventional optical fibers is the spectral broadening that enables high-intensity pulses to be transmitted with less distortion or loss of the spectral information, which has important applications such as optical coherence spectroscopy and tomography. Another advantage is that the use of large guiding areas can provide low-loss high-power delivery for imaging, lithography, and astronomy. Other potential applications range from birefringence and nonlinear optics to atomic particle guidance.16

10.1.3 Photon Tunneling by Coupled Evanescent Waves

In the preceding sections, we clearly demonstrated that an evanescent wave exists inside the optically rarer medium, which can be air or vacuum, and decays exponentially away


from the surface. Furthermore, the evanescent wave or field does not carry energy in the direction normal to the interface. On the other hand, if another optically denser medium is brought into close proximity with the first medium, as shown in Fig. 10.7, energy can be transmitted from the first to the third medium, even though the angle of incidence is greater than the critical angle. This phenomenon, known as frustrated total internal reflection, photon tunneling, or radiation tunneling, is very important for energy transfer between two bodies when the distance of separation is shorter than the dominant wavelength of the emitting source. Frustrated total internal reflection has been known since Newton's time and was theoretically investigated by Hall (Phys. Rev. Ser. I, 15, 73, 1902).

FIGURE 10.7 Illustration of photon tunneling. (a) Schematic drawing of the three layers and fields. (b) Calculated transmittance $T'_\lambda$ for a TE wave as a function of $d/\lambda$ at $\theta_1 = 10°$, $40°$, $50°$, and $80°$, assuming $n_1 = n_3 = 1.414$ and $n_2 = 1$ ($\theta_c = 45°$). Note the distinct differences between the interference effect and the photon tunneling phenomenon, where the transmittance decreases with increasing $d$ and becomes negligibly small for $d > \lambda$.

Cryogenic insulation is a practical example where photon tunneling may be significant.17 Advances in micro/nanotechnologies have made it possible for the energy transfer by photon tunneling to be appreciable and even dominant at room temperature or above. This may have applications ranging from microscale thermophotovoltaic devices to nanothermal processing and nanoelectronics thermal management.18–20

While photon tunneling is analogous to electron tunneling through a potential barrier, which may be explained by quantum mechanics, it can also be understood by the coupling of two oppositely decaying evanescent waves.21 Because of the second interface, a backward-decaying evanescent wave is formed inside layer 2, the optically rarer medium. The Poynting vector of the coupled evanescent fields has a nonzero normal component, suggesting that the energy transmission between the media is possible as long as the gap width is smaller than the wavelength. Beyond this distance, the field strength of the forward-decaying evanescent wave is too low when it reaches the second interface and the reflected evanescent field is negligible. The matrix formulation discussed in Chap. 9 can be used to calculate the transmittance and the reflectance through the gap (i.e., medium 2) as if there were propagating waves. To illustrate this, consider the case in which all three layers are dielectric. Taking the TM wave incidence as an example, let us write the magnetic field inside medium 2 as follows:

$$H_y(x, z) = \left(Ae^{ik_{2z}z} + Be^{-ik_{2z}z}\right)e^{ik_x x}, \quad 0 < z < d \tag{10.19}$$

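where $A$ and $B$ are determined by the incident field and boundary conditions. When two waves are combined, the Poynting vector of the field $\langle \mathbf{S} \rangle = \tfrac{1}{2}\mathrm{Re}[(\mathbf{E}_1 + \mathbf{E}_2) \times (\mathbf{H}_1^* + \mathbf{H}_2^*)]$ has four terms. Two of them can be associated with the power flux of each individual wave, while the other two represent the interaction between the waves. After simplification, the normal component of the Poynting vector can be expressed as

$$\langle S_z \rangle = \frac{k_{2z}}{2\omega\varepsilon_2\varepsilon_0}\left(|A|^2 - |B|^2\right), \quad \text{when } k_{2z}^2 = k_2^2 - k_x^2 > 0 \tag{10.20a}$$

and

$$\langle S_z \rangle = -\frac{\eta_2}{\omega\varepsilon_2\varepsilon_0}\,\mathrm{Im}(AB^*), \quad \text{when } \eta_2^2 = -k_{2z}^2 = k_x^2 - k_2^2 > 0 \tag{10.20b}$$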
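The statement that a single evanescent wave carries no energy while two coupled evanescent waves do can be checked directly. The short Python sketch below evaluates $\tfrac{1}{2}\mathrm{Re}(E_x H_y^*)$ for the TM field $H_y = Ae^{-\eta_2 z} + Be^{\eta_2 z}$ and compares it with Eq. (10.20b). The function names are illustrative, and the prefactor $\omega\varepsilon_2\varepsilon_0$ is normalized to 1 by assumption, which does not affect the comparison.

```python
import math


def sz_direct(a, b, eta2, z, omega_eps=1.0):
    """0.5*Re(Ex*conj(Hy)) for Hy = a*exp(-eta2*z) + b*exp(eta2*z) in a TM wave."""
    a, b = complex(a), complex(b)
    decay, growth = math.exp(-eta2 * z), math.exp(eta2 * z)
    hy = a * decay + b * growth
    # Ex = (1/(i*omega*eps)) dHy/dz with k2z = i*eta2.
    ex = 1j * eta2 / omega_eps * (a * decay - b * growth)
    return 0.5 * (ex * hy.conjugate()).real


def sz_coupled(a, b, eta2, omega_eps=1.0):
    """Right-hand side of Eq. (10.20b)."""
    a, b = complex(a), complex(b)
    return -eta2 / omega_eps * (a * b.conjugate()).imag


if __name__ == "__main__":
    amp_a, amp_b = 1.0 + 0.5j, 0.2 - 0.3j
    for z in (0.0, 0.1, 0.5):
        print(z, sz_direct(amp_a, amp_b, 2.0, z), sz_coupled(amp_a, amp_b, 2.0))
```

The direct evaluation gives the same value at every $z$, as expected for a lossless gap, and it vanishes whenever either $A$ or $B$ is zero.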

Because there is no loss or absorption, $\langle S_z \rangle$ is independent of $z$ in medium 2, and the ratio of $\langle S_z \rangle$ in medium 2 to that of the incidence in medium 1 is the transmittance. When propagating waves exist in medium 2 or the angle of incidence is smaller than the critical angle, interference will occur and the energy flux in the $z$ direction can be represented by the forward- and backward-propagating waves, see Eq. (10.20a). The transmittance oscillates as the thickness of medium 2 is increased. When evanescent waves exist in medium 2 at incidence angles greater than the critical angle, the transmittance is a decaying function of the thickness of medium 2, as shown in Fig. 10.7b. While the individual evanescent wave does not carry energy, the coupling results in energy transfer, as suggested by Eq. (10.20b). Equation (9.8) through Eq. (9.10), derived in the previous chapter, can be used to calculate the transmittance and the reflectance. These equations are applicable to arbitrary electric and magnetic properties as long as the medium is isotropic and homogeneous within each layer. The phase shift $\beta$ in these equations is purely imaginary when medium 2 is a dielectric and the waves in it are evanescent.

Example 10-4. Assuming that the incident field has an amplitude of 1, determine $A$ and $B$ in Eq. (10.19) for $\theta_1 > \theta_c = \sin^{-1}(n_2/n_1)$, when all three media are dielectric with $n_3 = n_1 > n_2$. Find an expression of the tunneling transmittance using real variables only.

Solution. The tangential fields can be written as follows for the three-layer structure shown in Fig. 10.7a. Note that $\eta_2 = \sqrt{k_x^2 - k_2^2} = (2\pi n_1/\lambda)\sqrt{\sin^2\theta_1 - \sin^2\theta_c}$.

$$H_y = \begin{cases} \left(e^{ik_{1z}z} + re^{-ik_{1z}z}\right)e^{ik_x x}, & z < 0 \\ \left(Ae^{-\eta_2 z} + Be^{\eta_2 z}\right)e^{ik_x x}, & 0 < z < d \\ te^{ik_{1z}(z-d)}e^{ik_x x}, & z > d \end{cases} \tag{10.21}$$

$$E_x = \begin{cases} \dfrac{k_{1z}}{\omega n_1^2\varepsilon_0}\left(e^{ik_{1z}z} - re^{-ik_{1z}z}\right)e^{ik_x x}, & z < 0 \\[2ex] \dfrac{i\eta_2}{\omega n_2^2\varepsilon_0}\left(Ae^{-\eta_2 z} - Be^{\eta_2 z}\right)e^{ik_x x}, & 0 < z < d \\[2ex] \dfrac{k_{1z}}{\omega n_1^2\varepsilon_0}\,te^{ik_{1z}(z-d)}e^{ik_x x}, & z > d \end{cases} \tag{10.22}$$

The continuity of tangential fields at the two interfaces allows us to determine $t$, $r$, $A$, and $B$. Note that because the incident field has an amplitude of 1, the preceding equations do not yield a set of homogeneous linear equations as in the case of guided waves. If we use Eq. (10.6) for $r_p = e^{-i\delta}$, where $\delta = 2\alpha$ and $\cot\alpha = (k_{1z}/n_1^2)/(\eta_2/n_2^2)$ for a TM wave, we can rewrite Eq. (9.7) and Eq. (9.8) to obtain the reflection and transmission coefficients as follows:

$$r = \frac{e^{-i\delta}\left(1 - e^{-2\eta_2 d}\right)}{1 - e^{-2i\delta}e^{-2\eta_2 d}} \tag{10.23}$$

$$t = \frac{\left(1 - e^{-2i\delta}\right)e^{-\eta_2 d}}{1 - e^{-2i\delta}e^{-2\eta_2 d}} \tag{10.24}$$


where we have used the relationship of Fresnel's coefficients and set the phase shift in Eq. (9.6) to $\beta = i\eta_2 d$. After matching the boundary conditions at $z = d$, we have

$$A = 0.5\,t\,[1 - i\cot\alpha]\,e^{\eta_2 d} \quad \text{and} \quad B = 0.5\,t\,[1 + i\cot\alpha]\,e^{-\eta_2 d} \tag{10.25}$$

It can be shown that the normal component of the Poynting vector is the same in media 2 and 3 (see Problem 10.8). The tunneling transmittance becomes $T'_\lambda = tt^*$, or

$$T'_\lambda = \frac{2[1 - \cos(2\delta)]\,e^{-2\eta_2 d}}{1 + e^{-4\eta_2 d} - 2\cos(2\delta)\,e^{-2\eta_2 d}} \tag{10.26a}$$

$$T'_\lambda = \frac{\sin^2\delta}{\sin^2\delta + \sinh^2(\eta_2 d)} \tag{10.26b}$$

Clearly, the tunneling transmittance does not oscillate as $d$ increases; rather, it decreases monotonically from 1 to 0 as $d$ is increased from 0 to infinity. Equation (10.23) through Eq. (10.25) can be applied to TE waves by taking $\cot\alpha = k_{1z}/\eta_2$, which changes the Fresnel reflection coefficient $r_p$ to $r_s$ because only the dielectric media are considered here. Equation (10.26) is convenient for calculating the tunneling transmittance between dielectrics.
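The equivalence between the complex transmission coefficient and the real-variable expression for the tunneling transmittance can be verified numerically. The following Python sketch assumes the conventions $r_{p,s} = e^{-i\delta}$ with $\delta = 2\alpha$ and a transmission coefficient referenced to the second interface. The function names and the sample conditions ($n_1 = n_3 = 1.414$, $n_2 = 1$, as in Fig. 10.7b) are chosen for illustration.

```python
import cmath
import math


def tunneling_parameters(n1, n2, theta1_deg, d_over_lambda, polarization="TE"):
    """Return (delta, eta2*d) for a gap of index n2 between two media of index n1."""
    theta1 = math.radians(theta1_deg)
    s = n1 * math.sin(theta1)
    if s <= n2:
        raise ValueError("theta1 must exceed the critical angle")
    root = math.sqrt(s**2 - n2**2)  # eta2 in units of 2*pi/lambda
    cot_alpha = n1 * math.cos(theta1) / root  # k1z/eta2 for a TE wave
    if polarization == "TM":
        cot_alpha *= (n2 / n1) ** 2  # (k1z/n1^2)/(eta2/n2^2)
    delta = 2.0 * math.atan2(1.0, cot_alpha)
    return delta, 2.0 * math.pi * d_over_lambda * root


def transmittance_complex(n1, n2, theta1_deg, d_over_lambda, polarization="TE"):
    """|t|^2 with t written in terms of exp(-i*delta) and exp(-eta2*d)."""
    delta, eta2_d = tunneling_parameters(n1, n2, theta1_deg, d_over_lambda, polarization)
    r_sq = cmath.exp(-2j * delta)
    t = (1.0 - r_sq) * math.exp(-eta2_d) / (1.0 - r_sq * math.exp(-2.0 * eta2_d))
    return abs(t) ** 2


def transmittance_real(n1, n2, theta1_deg, d_over_lambda, polarization="TE"):
    """Real-variable form sin^2(delta)/(sin^2(delta) + sinh^2(eta2*d))."""
    delta, eta2_d = tunneling_parameters(n1, n2, theta1_deg, d_over_lambda, polarization)
    return math.sin(delta) ** 2 / (math.sin(delta) ** 2 + math.sinh(eta2_d) ** 2)


if __name__ == "__main__":
    for ratio in (0.0, 0.1, 0.4, 0.8, 1.6):
        print(ratio, transmittance_real(1.414, 1.0, 50.0, ratio, "TE"))
```

The printed values decrease monotonically from 1 as $d/\lambda$ grows, in line with the behavior described for incidence angles beyond the critical angle.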

10.1.4 Thermal Energy Transfer between Closely Spaced Dielectrics

Energy exchange between closely spaced dielectric plates can be calculated by integrating Planck's function over all wavelengths as well as over the whole hemisphere using the directional-spectral transmittance. Let us use an example to illustrate the procedure and the effect of photon tunneling and interferences on the near-field thermal radiation.

Example 10-5. Calculate the hemispherical transmittance between two dielectrics of $n_1 = n_3 = 3$, separated by a vacuum gap $d$ ($n_2 = 1$). Use the results to calculate the radiative energy transfer between the two media, assuming $T_1 = 1000$ K and $T_3 = 300$ K.

Analysis. In the far field, we can use the following formula discussed in Chap. 2 (see Example 2-6) to calculate the net radiative heat flux:

$$q''_{13,\,d\to\infty} = \frac{\sigma_{SB}T_1^4 - \sigma_{SB}T_3^4}{1/\varepsilon_1 + 1/\varepsilon_3 - 1} \tag{10.27}$$

The hemispherical emissivity of each surface can be evaluated using Eq. (8.86), which can be rewritten as follows, considering that the emissivity is independent of the azimuthal angle $\phi$:

$$\varepsilon_{\lambda,h} = 2\int_0^{\pi/2} \varepsilon'_\lambda(\theta)\cos\theta\sin\theta\,d\theta \tag{10.28}$$

One could average the directional-spectral emissivity over the two polarizations. However, the preferable way is to calculate the hemispherical emissivity for each polarization and use it to calculate the net heat flux by taking half of Eq. (10.27). The heat fluxes calculated for the two polarizations can then be added to obtain the total heat flux. The results give the far-field limit, which is always smaller than $q''_{13,BB} = \sigma_{SB}(T_1^4 - T_3^4)$, the net radiative heat flux between two blackbodies. This will not be the case in the near field when interference and tunneling effects are important.

Solution. The hemispherical transmittance can be evaluated in a similar way by integration over the hemisphere. Note that only a small cone of radiation, originating from medium 1, will result in propagating waves in medium 2. The half cone angle is the critical angle, which is $\theta_c = \sin^{-1}(n_2/n_1) \approx 19.5°$. Thus, we can divide the hemispherical transmittance into two parts and evaluate them separately. Keeping in mind that the transmittance is defined as the ratio of the transmitted energy to the incident energy, we can sum the two parts to obtain the hemispherical transmittance

$$T_{\lambda,h} = T_{\lambda,\mathrm{prop}} + T_{\lambda,\mathrm{evan}} \tag{10.29}$$

where

$$T_{\lambda,\mathrm{prop}} = 2\int_0^{\theta_c} T'_\lambda \cos\theta\sin\theta\,d\theta \tag{10.29a}$$

and

$$T_{\lambda,\mathrm{evan}} = 2\int_{\theta_c}^{\pi/2} T'_\lambda \cos\theta\sin\theta\,d\theta \tag{10.29b}$$
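The split of the hemispherical transmittance into propagating and evanescent parts lends itself to a simple numerical quadrature. The sketch below is one possible implementation under stated assumptions: the directional transmittance is taken as $|t|^2$ from the standard three-layer (Airy) expression with identical media on both sides of the gap, the integrals are evaluated with a midpoint rule, and all function names are hypothetical.

```python
import cmath
import math


def directional_transmittance(n1, n2, theta1, d_over_lambda, polarization="TE"):
    """|t|^2 for medium 1 | gap (n2) | medium 1; theta1 in radians."""
    s = n1 * math.sin(theta1)
    k1z = n1 * math.cos(theta1)      # in units of 2*pi/lambda
    k2z = cmath.sqrt(n2**2 - s**2)   # real if propagating, +i*eta2 if evanescent
    p1 = k1z if polarization == "TE" else k1z / n1**2
    p2 = k2z if polarization == "TE" else k2z / n2**2
    r12 = (p1 - p2) / (p1 + p2)
    phase = cmath.exp(2j * math.pi * d_over_lambda * k2z)
    t = (1.0 - r12**2) * phase / (1.0 - r12**2 * phase**2)
    return abs(t) ** 2


def hemispherical_parts(n1, n2, d_over_lambda, polarization="TE", steps=2000):
    """Return (T_prop, T_evan): integrals over [0, theta_c] and [theta_c, pi/2]."""
    theta_c = math.asin(n2 / n1)

    def integrate(lo, hi):
        h = (hi - lo) / steps
        total = 0.0
        for i in range(steps):
            theta = lo + (i + 0.5) * h  # midpoints never hit theta_c exactly
            total += (directional_transmittance(n1, n2, theta, d_over_lambda, polarization)
                      * math.cos(theta) * math.sin(theta))
        return 2.0 * total * h

    return integrate(0.0, theta_c), integrate(theta_c, 0.5 * math.pi)


if __name__ == "__main__":
    for ratio in (0.0, 0.01, 0.1, 1.0, 10.0):
        te = hemispherical_parts(3.0, 1.0, ratio, "TE")
        tm = hemispherical_parts(3.0, 1.0, ratio, "TM")
        print(ratio, 0.5 * (te[0] + tm[0]), 0.5 * (te[1] + tm[1]))
```

In the limit $d/\lambda \to 0$ the directional transmittance is unity at all angles, so the two parts reduce to $\sin^2\theta_c = 1/n_1^2$ and $1 - 1/n_1^2$; for $n_1 = 3$ the evanescent part is $8/9$ of the total.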

If $n_1 \ne n_3$, $\theta_c$ will depend on whether the incidence is from medium 1 or 3, and the resulting hemispherical transmittance will be the same. We can obtain the average transmittance for the two polarizations, as shown in Fig. 10.8a. The propagating wave contribution shows some oscillations but reaches a constant value when $d/\lambda \to 0$, where all waves will be constructively added. At $d/\lambda \gg 1$, the constructive and destructive interferences cancel out so that $T_{\lambda,\mathrm{prop}}$ becomes a constant again. The contribution of evanescent waves becomes important when $d/\lambda < 1$ and starts to dominate over that of the propagating waves when $d/\lambda \ll 1$. When $d/\lambda \to 0$, the evanescent wave or tunneling contributes to nearly 90% of the transmittance when $n_1 = 3$. This explains why photon tunneling is very important for the near-field energy transfer.

FIGURE 10.8 Radiation heat transfer between dielectric surfaces in close proximity ($n_1 = n_3 = 3$, $n_2 = 1$, $T_1 = 1000$ K, $T_3 = 300$ K). (a) Contributions to hemispherical transmittance by interference and tunneling as a function of $d/\lambda$, where the transmittance is the average of both polarizations. (b) Net heat flux $q''_{13}$ (W/m²) as a function of the distance of separation $d$ (µm).

Planck's blackbody distribution function, given by Eq. (8.44), can be rewritten for each polarization in media 1 and 3, respectively, as

$$e_{b,\lambda}(\lambda, T_1) = \frac{n_1^2 C_1}{2\lambda^5\left(e^{C_2/\lambda T_1} - 1\right)} \tag{10.30a}$$

and

$$e_{b,\lambda}(\lambda, T_3) = \frac{n_3^2 C_1}{2\lambda^5\left(e^{C_2/\lambda T_3} - 1\right)} \tag{10.30b}$$

where $\lambda$ in µm is the wavelength in vacuum, and $C_1 = 3.742 \times 10^8$ W·µm⁴/m² and $C_2 = 1.439 \times 10^4$ µm·K are the first and second radiation constants in vacuum. The emissive power in a nondispersive dielectric is increased by a factor of the square of the refractive index, as a result of the increased photon density of states. The factor 2 in the denominator is included because only a single polarization has been considered. The radiation heat flux from medium 1 to 3 is

$$q''_{1\to3} = \int_0^\infty e_{b,\lambda}(\lambda, T_1)\,T_{\lambda,h}(\lambda)\,d\lambda \tag{10.31a}$$

and that from medium 3 to 1 is

$$q''_{3\to1} = \int_0^\infty e_{b,\lambda}(\lambda, T_3)\,T_{\lambda,h}(\lambda)\,d\lambda \tag{10.31b}$$

where $T_{\lambda,h}$ is obtained from Eq. (10.29). Hence, the net radiation heat transfer becomes

$$q''_{13} = q''_{1\to3} - q''_{3\to1} \tag{10.32}$$

One can also separately substitute the hemispherical transmittance of propagating and evanescent waves into Eq. (10.31). Equation (10.32) should be individually applied to TE and TM waves, and then summed together to get the net heat flux. The integration limits can be set such that the lower limit $\lambda_L = 0.1\lambda_{mp}$ and the upper limit $\lambda_H = 10\lambda_{mp}$, where $\lambda_{mp}$ is the wavelength corresponding to the maximum blackbody emissive power at the temperature of the higher-temperature medium as expressed in Eq. (8.45). The calculated results of the near-field radiative transfer are shown in Fig. 10.8b as a function of the separation distance $d$. Several important observations can be made.

(a) When $d \ll \lambda_{mp}$, the propagating waves result in $q''_{13,\mathrm{prop}} \approx \sigma_{SB}(T_1^4 - T_3^4)$ and the evanescent waves result in $q''_{13,\mathrm{evan}} \approx (n_1^2 - 1)\,\sigma_{SB}(T_1^4 - T_3^4)$. The combined net radiation heat transfer is $q''_{13,\mathrm{comb}} \approx n_1^2\,\sigma_{SB}(T_1^4 - T_3^4)$.

(b) As the distance increases, the evanescent wave contribution goes down monotonically and becomes negligible when $d > \lambda_{mp}$, which is about 3 µm.

(c) Due to interference effects, the energy transfer by propagating waves decreases slightly as $d$ increases and then reaches the far-field limit, Eq. (10.27), when $d \gg \lambda_{mp}$.
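The limiting values in observations (a) and (c) can be estimated without the full spectral integration. The Python sketch below evaluates the near-field limits for $n_1 = 3$, $T_1 = 1000$ K, and $T_3 = 300$ K, and the far-field limit from Eq. (10.27) applied to each polarization. It assumes, for illustration only, that the directional emissivity of each semi-infinite dielectric equals one minus the Fresnel reflectivity of its interface with vacuum; the function names are hypothetical.

```python
import math

SIGMA_SB = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def blackbody_net_flux(t1, t3):
    """Net flux between two blackbodies, sigma*(T1^4 - T3^4)."""
    return SIGMA_SB * (t1**4 - t3**4)


def near_field_limits(n1, t1, t3):
    """Limits of observation (a) for d much smaller than lambda_mp."""
    q_bb = blackbody_net_flux(t1, t3)
    return {"prop": q_bb, "evan": (n1**2 - 1.0) * q_bb, "comb": n1**2 * q_bb}


def hemispherical_emissivity(n, polarization, steps=2000):
    """Eq. (10.28) by the midpoint rule, with emissivity = 1 - Fresnel reflectivity (assumed)."""
    h = 0.5 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h  # angle measured in vacuum
        c = math.cos(theta)
        root = math.sqrt(n**2 - math.sin(theta) ** 2)
        if polarization == "TE":
            r = (c - root) / (c + root)
        else:
            r = (n**2 * c - root) / (n**2 * c + root)
        total += (1.0 - r**2) * c * math.sin(theta)
    return 2.0 * total * h


def far_field_flux(n1, n3, t1, t3):
    """Half of Eq. (10.27) for each polarization, then summed."""
    q = 0.0
    for pol in ("TE", "TM"):
        e1 = hemispherical_emissivity(n1, pol)
        e3 = hemispherical_emissivity(n3, pol)
        q += 0.5 * blackbody_net_flux(t1, t3) / (1.0 / e1 + 1.0 / e3 - 1.0)
    return q


if __name__ == "__main__":
    print(near_field_limits(3.0, 1000.0, 300.0))
    print(far_field_flux(3.0, 3.0, 1000.0, 300.0))
```

The combined near-field limit exceeds the blackbody value by the factor $n_1^2 = 9$, whereas the far-field estimate stays below the blackbody value, which is the contrast drawn in the analysis of this example.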

If the media were conductive, the previous calculations would not be appropriate because of the large imaginary part of the refractive index or the dielectric function. In fact, the near-field radiation heat transfer can be greatly enhanced with the presence of surface waves or if the media are semiconductors.18–20 The treatment requires the knowledge of fluctuational electrodynamics, which will be discussed in Sec. 10.5 at length.

10.1.5 Resonance Tunneling through Periodic Dielectric Layers

There exists a photonic analogue of resonance tunneling of electrons in double-barrier quantum well structures. The geometry to illustrate resonance photon tunneling is depicted in Fig. 10.9a, with periodic layers of thicknesses $a$ and $b$, like the photonic crystal (PC) structure discussed in Sec. 9.3, with a period $\Lambda = a + b$. For tunneling to occur, the double-prism structure can be used so that light is incident from medium 1 with a refractive index $n_1$. The barrier of thickness $b$ is made of another dielectric with a refractive index $n_2$ that is lower than $n_1$. There are $N$ periods or unit cells in total between the end media. Light is incident at an incidence angle $\theta_1 > \theta_c = \sin^{-1}(n_2/n_1)$. Yeh performed a detailed analysis of this phenomenon and derived the equation of transmittance,22 which can be expressed as

$$T'_\lambda = \frac{1}{1 + \dfrac{\sinh^2(\eta_2 b)}{\sin^2\delta}\,\dfrac{\sin^2(NK\Lambda)}{\sin^2(K\Lambda)}} \tag{10.33}$$

where $K$ is the Bloch wavevector of the PC, $\delta$ is the phase angle upon total internal reflection, and $\eta_2$ is the imaginary part of the normal component of the wavevector in the lower-index dielectric, as defined in Example 10-4. It can be seen that Eq. (10.33) reduces to Eq. (10.26a) and Eq. (10.26b) for $N = 1$, where the transmittance is 1 at $b = 0$, and decreases monotonically with increasing $b$.

FIGURE 10.9 Resonance tunneling. (a) Alternating high-index ($n_1$) and low-index ($n_2$) multiple dielectric layers for resonance tunneling. (b) Calculated transmittance spectra $T'_\lambda$ versus $\lambda/b$ for $N = 2$ and $N = 5$, at two wavelength regions ($0.9606 \le \lambda/b \le 0.9636$ and $1.70 \le \lambda/b \le 1.82$). Calculation conditions are $n_1 = 3$, $n_2 = 2$, $a = b/2$, and $\theta_1 = 45°$.

The following equation can be used to calculate $K$:

$$\cos(K\Lambda) = \cos(k_{1z}a)\cosh(\eta_2 b) - \cot(\delta)\sin(k_{1z}a)\sinh(\eta_2 b) \tag{10.34}$$

where $k_{1z}$ is the normal component of the wavevector in medium 1. While $\cos(K\Lambda)$ is real, $K$ is in general complex. However, there exist regions or pass bands where $|\cos(K\Lambda)| \le 1$ so that $K$ is real. The transmittance does not oscillate in the pass bands, unlike what was shown in Fig. 9.17, where propagating waves exist in both types of dielectrics. Here, evanescent waves exist in the lower-index dielectric layers. However, the transmittance expressed in Eq. (10.33) becomes unity when the following equation holds:

$$\frac{\sin(NK\Lambda)}{\sin(K\Lambda)} = 0 \tag{10.35}$$
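The resonance condition can be explored numerically by combining the transmittance formula with the Bloch relation. In the sketch below, the ratio $\sin(NK\Lambda)/\sin(K\Lambda)$ is evaluated as the Chebyshev polynomial of the second kind $U_{N-1}(\cos K\Lambda)$ through its recurrence, so that no explicit (possibly complex) $K$ is needed. The sign convention $\delta = 2\alpha$ with $\tan\alpha = \gamma\eta_2/k_{1z}$ is assumed, and the function name is illustrative.

```python
import math


def resonance_tunneling_transmittance(n1, n2, theta1_deg, a, b, wavelength,
                                      n_periods, polarization="TE"):
    """Transmittance through N barriers of thickness b separated by layers of thickness a."""
    theta1 = math.radians(theta1_deg)
    k0 = 2.0 * math.pi / wavelength
    s = n1 * math.sin(theta1)
    if s <= n2:
        raise ValueError("theta1 must exceed the critical angle")
    k1z = k0 * n1 * math.cos(theta1)
    eta2 = k0 * math.sqrt(s**2 - n2**2)
    gamma = 1.0 if polarization == "TE" else (n1 / n2) ** 2
    delta = 2.0 * math.atan(gamma * eta2 / k1z)
    # Bloch relation for cos(K*Lambda), with Lambda = a + b.
    cos_kl = (math.cos(k1z * a) * math.cosh(eta2 * b)
              - math.sin(k1z * a) * math.sinh(eta2 * b) / math.tan(delta))
    # sin(N*K*Lambda)/sin(K*Lambda) = U_{N-1}(cos_kl), valid for real or complex K.
    u_prev, u_curr = 0.0, 1.0  # U_{-1} and U_0
    for _ in range(n_periods - 1):
        u_prev, u_curr = u_curr, 2.0 * cos_kl * u_curr - u_prev
    barrier = (math.sinh(eta2 * b) / math.sin(delta)) ** 2
    return 1.0 / (1.0 + barrier * u_curr**2)


if __name__ == "__main__":
    # Conditions of Fig. 10.9b with b = 1, so that the wavelength equals lambda/b.
    for lam in (1.70, 1.74, 1.76, 1.78, 1.82):
        print(lam, resonance_tunneling_transmittance(3.0, 2.0, 45.0, 0.5, 1.0, lam, 2))
```

For $N = 2$ the polynomial is $U_1 = 2\cos(K\Lambda)$, so unit transmittance occurs where $\cos(K\Lambda) = 0$; a fine scan over $\lambda/b$ is needed to resolve the narrow peaks for larger $N$.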

The denominator of this equation simply excludes the zeros in sin (NK) for K mp, m 0, 1, 2,c It turned out that in each pass band, there exist (N 1) solutions, with different combinations of v, kx, and the thicknesses a and b. As an example, Fig. 10.9b illustrated the transmittance as a function of l/b when n1 3, n2 2, u1 45 and a/b 0.5. Because of the narrow transmittance peaks, the plot is broken into two panels, each corresponding to a pass band. For N 2, there is only one peak in each pass band, while for N 5, there are four peaks. Yeh showed that the resonance frequencies correspond to the guided modes in the multilayer-waveguide equations.22 Hence, the fields are highly localized near the higher-index layer. Total internal reflection causes very high

NEAR-FIELD ENERGY TRANSFER

393

reflection on the surfaces of the higher-index layer and produces resonances similar to those in a Fabry-Perot cavity resonator. It should be noted that extremely sharp transmittance peaks can be obtained when l is close to the gap thickness b (see the upper panel). Further investigation on resonance tunneling is needed for the application in narrow band-pass filters. Due to the guided modes and the localized field, the magnitude of the evanescent wave may be amplified in the forward direction in some region (see Problem 10.11). Similar to the lateral shift by total internal reflection, due to the parallel energy flow in the high-index layer (waveguide), there must be a lateral shift of the transmitted light for finite beams. Little has been reported in the literature about the beam shift and the field distribution in dielectric multilayer structures, when resonance tunneling occurs. 10.1.6 Photon Tunneling with Negative Index Materials Negative index materials (NIMs), for which the permittivity and the permeability become negative simultaneously in a given frequency region, can also be used to enhance photon tunneling.23 The basics of NIMs has already been presented in Sec. 8.4.6. The structure is illustrated in Fig. 10.10a with a pair of layers in between two prisms. One of the layers has

FIGURE 10.10 Photon tunneling with a layer of NIM. (a) The tunneling arrangement. (b) The field distribution in the middle layers for a TE wave.

a negative refractive index. Assume that one of the layers is vacuum and the other has $\varepsilon = \mu = -1$, so that its refractive index is exactly $-1$. The transmittance becomes unity when the thickness of the NIM layer and that of the vacuum are the same, regardless of the angle of incidence and polarization. Let us use the full notation of $\varepsilon$ and $\mu$ without using the refractive index. The transmission coefficient can be expressed as follows:23

$$t = \frac{8}{\xi_1 e^{-i\phi_1} + \xi_2 e^{i\phi_1} + \xi_3 e^{-i\phi_2} + \xi_4 e^{i\phi_2}} \tag{10.36}$$

Here, the phase angles $\phi_1$ and $\phi_2$ can be expressed as

$$\phi_1 = k_{2z}d_2 + k_{3z}d_3 \quad\text{and}\quad \phi_2 = k_{2z}d_2 - k_{3z}d_3 \tag{10.37}$$


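where $d_2$ and $d_3$ are the thicknesses of layers 2 and 3, and $k_{2z}$ and $k_{3z}$ are the normal components of the wavevector in media 2 and 3, respectively. Note that when tunneling occurs, $k_{2z}$ and $k_{3z}$ become purely imaginary for the lossless case, as will be discussed later. For a TE wave, the coefficients in Eq. (10.36) are

$$\xi_1 = \left(1 + \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 + \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 + \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38a}$$

$$\xi_2 = \left(1 - \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 + \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 - \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38b}$$

$$\xi_3 = \left(1 + \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 - \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 - \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38c}$$

and

$$\xi_4 = \left(1 - \frac{k_{2z}\mu_1}{k_{1z}\mu_2}\right)\left(1 - \frac{k_{3z}\mu_2}{k_{2z}\mu_3}\right)\left(1 + \frac{k_{4z}\mu_3}{k_{3z}\mu_4}\right) \tag{10.38d}$$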

For a TM wave, the transmission coefficient is defined based on the magnetic fields, and the coefficients can be easily obtained by substituting $\varepsilon$'s for $\mu$'s in Eq. (10.38). The sign selection of $k_{lz}$ was mentioned in Sec. 9.2.2 in the discussion of the matrix formulation. Basically, when there exist propagating waves in medium $l$, $k_{lz} = (2\pi n_l/\lambda)\sqrt{1 - (n_1/n_l)^2\sin^2\theta_1}$, whose sign becomes negative in a NIM. On the other hand, if the waves become evanescent in medium $l$, we use $k_{lz} = i(2\pi/\lambda)\sqrt{n_1^2\sin^2\theta_1 - n_l^2} = i\eta_l$. Here, $\eta_l$ is always positive in a lossless medium, even in a NIM. Assume that the prisms are made of the same material so that the properties of medium 1 and medium 4 are identical. Furthermore, layer 2 is made of a NIM with index-matching conditions, i.e., $\varepsilon_2 = -\varepsilon_3$ and $\mu_2 = -\mu_3$ so that $n_2 = -n_3$. Equation (10.36) can then be further simplified. For propagating waves in the middle layers, $k_{2z} = -k_{3z}$ and $\xi_3 = \xi_4 = 0$; thus,

$$t = \frac{1}{\cos(k_{3z}\Delta) - iY\sin(k_{3z}\Delta)} \tag{10.39}$$

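where $\Delta = d_3 - d_2$, $Y = \tfrac{1}{2}\left(\dfrac{k_{3z}\mu_1}{k_{1z}\mu_3} + \dfrac{k_{1z}\mu_3}{k_{3z}\mu_1}\right)$ for TE waves, and $Y = \tfrac{1}{2}\left(\dfrac{k_{3z}\varepsilon_1}{k_{1z}\varepsilon_3} + \dfrac{k_{1z}\varepsilon_3}{k_{3z}\varepsilon_1}\right)$ for TM waves. Because media 1 and 4 are made of the same material, the transmittance for propagating waves can be written as follows:

$$T'_\lambda = \frac{1}{\cos^2(k_{3z}\Delta) + Y^2\sin^2(k_{3z}\Delta)} \tag{10.40}$$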

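For evanescent waves, we have $k_{2z} = k_{3z} = i\eta_3$, where $\eta_3 = (2\pi/\lambda)\sqrt{n_1^2\sin^2\theta_1 - n_3^2}$. Now that $\xi_1 = \xi_2 = 0$, Eq. (10.36) can be simplified so that

$$t = \frac{1}{\cosh(\eta_3\Delta) + i\cot(\delta)\sinh(\eta_3\Delta)} \tag{10.41}$$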

where $\cot(\delta) = \tfrac{1}{2}\left(\dfrac{\eta_3\mu_1}{k_{1z}\mu_3} - \dfrac{k_{1z}\mu_3}{\eta_3\mu_1}\right)$, with $\delta$ being the phase change upon total internal reflection between media 1 and 2. The transmittance $T'_\lambda = tt^*$ is real and always decreases with increasing $|\Delta|$, the difference between the layer thicknesses. Although Eq. (10.39) and Eq. (10.41) are identical because $\sin(ix) = i\sinh(x)$ and $\cos(ix) = \cosh(x)$, the use of real variables allows us to observe the variation of transmittance with $\Delta$ easily. When tunneling occurs, the field is highly localized near the interface between the NIM and the PIM layers, as shown in Fig. 10.10b for a TE wave, where the field is the sum of the forward-decaying and backward-decaying evanescent waves. The amplitude of the evanescent wave in the NIM increases in the direction of energy flow. It can be shown that the amplitude will still increase in medium 2, even though the NIM is placed in layer 3 and layer 2 is a vacuum. This corresponds to another resonance effect, which is associated with the excitation of surface electromagnetic waves or surface polaritons, to be discussed in the next section. The directional and hemispherical transmittances for the structure shown in Fig. 10.10a are illustrated in Fig. 10.11 with the following parameters: $n_1 = n_4 = 1.5$, $n_2 = -1$

[Figure 10.11. Panel (a): directional transmittance $T'_\lambda$ versus $(d_3 - d_2)/\lambda$ for TE and TM waves at $\theta_1 = 45^\circ$ and $60^\circ$, with $n_1 = n_4 = 1.5$, $n_2 = -1$, and $n_3 = 1$. Panel (b): hemispherical transmittance versus $d_3/\lambda$ for $d_2/\lambda = 1$, showing the propagating-wave and evanescent-wave contributions and their total.]

FIGURE 10.11 Transmittance for a four-layer structure with one middle layer being matching-index NIM. (a) Directional transmittance. (b) Hemispherical transmittance.

($\varepsilon_2 = \mu_2 = -1$), and $n_3 = 1$ (vacuum). Both the directional and hemispherical transmittances become 1 when $d_3 = d_2$. The hemispherical transmittance has two components due to propagating and evanescent waves. The effects of loss and dispersion have also been examined.24
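As a numerical check of the statements above, the following minimal sketch evaluates the TE transmission coefficient of Eqs. (10.36) to (10.38) for a lossless four-medium structure. The function names `kz` and `t_te` and the list-based argument layout are hypothetical choices made for this illustration. The sign convention assumed here is a forward wave written as $e^{ik_z z}$, with $k_z$ negative for propagating waves in a NIM and $k_z = i\eta$, $\eta > 0$, for evanescent waves.

```python
import cmath
import math


def kz(eps, mu, kx, k0):
    """Normal wavevector component in a lossless medium (eps and mu real)."""
    q = eps * mu * k0**2 - kx**2
    if q >= 0.0:
        # Propagating wave: the sign follows the refractive index (negative in a NIM).
        sign = -1.0 if (eps < 0.0 and mu < 0.0) else 1.0
        return complex(sign * math.sqrt(q), 0.0)
    # Evanescent wave: k_z = i*eta with eta > 0, also in a NIM.
    return complex(0.0, math.sqrt(-q))


def t_te(eps, mu, d2, d3, theta1, lam):
    """TE transmission coefficient of media 1|2|3|4 following Eq. (10.36).

    eps, mu : sequences of four real values for media 1 to 4
    d2, d3  : thicknesses of layers 2 and 3 (same length unit as lam)
    theta1  : incidence angle in medium 1, in radians
    """
    k0 = 2.0 * math.pi / lam
    kx = math.sqrt(eps[0] * mu[0]) * k0 * math.sin(theta1)
    k = [kz(e, m, kx, k0) for e, m in zip(eps, mu)]
    # a[0] = k2z*mu1/(k1z*mu2), a[1] = k3z*mu2/(k2z*mu3), a[2] = k4z*mu3/(k3z*mu4)
    a = [k[j + 1] * mu[j] / (k[j] * mu[j + 1]) for j in range(3)]
    phi1 = k[1] * d2 + k[2] * d3
    phi2 = k[1] * d2 - k[2] * d3
    xi1 = (1 + a[0]) * (1 + a[1]) * (1 + a[2])
    xi2 = (1 - a[0]) * (1 + a[1]) * (1 - a[2])
    xi3 = (1 + a[0]) * (1 - a[1]) * (1 - a[2])
    xi4 = (1 - a[0]) * (1 - a[1]) * (1 + a[2])
    denom = (xi1 * cmath.exp(-1j * phi1) + xi2 * cmath.exp(1j * phi1)
             + xi3 * cmath.exp(-1j * phi2) + xi4 * cmath.exp(1j * phi2))
    return 8.0 / denom


if __name__ == "__main__":
    # Prism (n = 1.5) | NIM (eps = mu = -1) | vacuum | prism, lengths in units of lam.
    eps = [2.25, -1.0, 1.0, 2.25]
    mu = [1.0, -1.0, 1.0, 1.0]
    for deg in (30.0, 45.0):
        for d3 in (1.0, 1.3):
            t = t_te(eps, mu, 1.0, d3, math.radians(deg), 1.0)
            print(f"theta1 = {deg} deg, d3/lam = {d3}: T = {abs(t)**2:.4f}")
```

With equal thicknesses the computed transmittance should be unity both below the critical angle (propagating waves in the middle layers) and above it (tunneling), which is the behavior described for Fig. 10.11. For unequal thicknesses above the critical angle, the result can be compared with the closed form of Eq. (10.41).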

10.2 POLARITONS OR ELECTROMAGNETIC SURFACE WAVES

Surface plasmons, also known as surface plasmon polaritons, play an important role in near-field microscopy, nanophotonics, and biomolecular sensor applications.25–27 Surface plasmon polaritons represent the interaction between electromagnetic waves and the oscillatory movement of free charges near the surface of metallic materials. When surface plasmons are confined to small structures, such as the tip of a scanning microscopic probe, quantum dots or nanoparticles, nanowires, or nanoapertures, they are referred to as localized plasmons. Surface plasmons usually occur in the visible or near-infrared region of the electromagnetic spectrum for highly conductive metals such as Ag, Al, and Au. In some polar dielectric materials, phonons or bound charges can also interact with electromagnetic waves in the mid-infrared spectral region and cause resonance effects near the surface; these are called surface phonon polaritons, which have applications in tuning the thermal emission properties28 and nanoscale nondestructive imaging.29 In the following, the basic mechanisms of surface polaritons will be presented, with discussions on some


important applications. Emphasis is placed on the quantitative analysis of radiative properties for layered structures. In Sec. 10.4, the superlens concept will be introduced for imaging beyond the diffraction limit, and the energy streamline method will be presented for analyzing the energy propagation direction in the near-field regime.

10.2.1 Surface Plasmon and Phonon Polaritons

Plasmons are quasiparticles associated with oscillations of plasma, which is a collection of charged particles such as electrons in a metal or semiconductor. Plasmons are longitudinal excitations that can occur either in the bulk or at the interface. As shown in Fig. 10.12a, the

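[Figure 10.12. Panel (a): schematic of a surface plasmon at the interface between air ($\varepsilon_1$) and metal ($\varepsilon_2$), with the magnetic field $H_y$. Panel (b): the decaying fields along $z$, with $H_y \propto e^{\eta_1 z}$ in air and $H_y \propto e^{-\eta_2 z}$ in the metal.]

FIGURE 10.12 Illustration of surface plasmon polariton. (a) Charge fluctuations and the magnetic field at the interface between a metal and air. (b) The exponentially decaying field amplitudes away from the interface.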

charges oscillate along the surface, and such an excitation is called a surface plasmon or surface plasmon polariton. The field associated with a plasmon is localized at the surface, and the amplitude decays away from the interface, as shown in Fig. 10.12b. Such a wave propagates along the surface, and therefore, it is called a surface electromagnetic wave, similar to surface waves in fluids or acoustic surface waves. Surface plasmons can be excited by electromagnetic waves and are important for the study of optical properties of metallic materials, especially near the plasma frequency, which usually lies in the ultraviolet. The requirement of evanescent waves on both sides of the interface prohibits the coupling of propagating waves in air to the surface plasmons. For this reason, surface waves are often regarded as nonradiative modes. Attenuated total reflectance (ATR) arrangements are commonly used to excite surface plasmons. When light is incident from the prism, it is possible for evanescent waves to occur simultaneously in the underlying metallic and air layers, as shown in Fig. 10.13 for the two typical configurations named after A. Otto (prism-air-metal) and E. Kretschmann and H. Raether (prism-metal-air). A detailed discussion with historical aspects can be found in Raether.30 In addition to the requirement of evanescent waves on both sides of the interface, the polariton dispersion relations must be satisfied. They are expressed as follows when both media extend to infinity in the z direction:

$$\frac{k_{1z}}{\varepsilon_1} + \frac{k_{2z}}{\varepsilon_2} = 0 \quad \text{for TM wave} \tag{10.42}$$

$$\frac{k_{1z}}{\mu_1} + \frac{k_{2z}}{\mu_2} = 0 \quad \text{for TE wave} \tag{10.43}$$

[Figure 10.13. Panel (a): Otto configuration, with light incident at angle $\theta$ from the prism ($\varepsilon_d$) onto an air layer ($\varepsilon_1$) above the metal ($\varepsilon_2$). Panel (b): Kretschmann configuration, with the metal film between the prism and air. The parallel wavevector component $k_x$ of the incident wavevector $k_1$ is indicated in both panels.]

FIGURE 10.13 Typical configurations for coupling electromagnetic waves with surface polaritons using attenuated total reflectance arrangements. (a) The Otto configuration (prism-air-metal). (b) The Kretschmann-Raether configuration (prism-metal-air). Note that a polar dielectric may substitute for the metal to excite surface phonon polaritons.

Let us consider lossless media first. In order for evanescent waves to occur, we must have $k_{1z} = i\eta_1$ and $k_{2z} = i\eta_2$ with $\eta_1$ and $\eta_2$ being positive, in order for the field $e^{ik_x x - ik_{1z}z} = e^{ik_x x + \eta_1 z}$ to decay toward $z = -\infty$ and $e^{ik_x x + ik_{2z}z} = e^{ik_x x - \eta_2 z}$ to decay toward $z = \infty$. Equation (10.42) then reads $\eta_1/\varepsilon_1 + \eta_2/\varepsilon_2 = 0$, which means that the sign of the permittivity must be opposite for media 1 and 2 in order to couple a surface polariton with a TM wave. On the other hand, we will need a magnetic material with negative permeability for a TE wave to be able to couple with a surface polariton. NIMs exhibit simultaneously negative permittivity and permeability in the same frequency region and are sometimes called double-negative (DNG) materials. Therefore, both TE and TM waves may excite surface plasmon polaritons with a NIM, as predicted by Ruppin.31 When compared with Fresnel's reflection coefficients, as can be seen from Eq. (10.2), the condition for the excitation of surface polaritons is that the denominator of the reflection coefficient be zero. A pole in the reflection coefficient is an indication of a resonance. Very often, the surface plasmon polariton is referred to in the literature as a surface plasmon resonance. Taking a TM wave for example, since $k_{1z}^2 = \mu_1\varepsilon_1\omega^2/c^2 - k_x^2$ and $k_{2z}^2 = \mu_2\varepsilon_2\omega^2/c^2 - k_x^2$ from Eq. (10.1), we can solve Eq. (10.42) to obtain

$$k_x = \frac{\omega}{c}\sqrt{\frac{\mu_1/\varepsilon_1 - \mu_2/\varepsilon_2}{1/\varepsilon_1^2 - 1/\varepsilon_2^2}} \tag{10.44}$$

Equation (10.44) relates the frequency with the parallel component of the wavevector and is another form of the polariton dispersion relation. It should be noted that solutions of this equation are for both $k_{1z}/\varepsilon_1 + k_{2z}/\varepsilon_2 = 0$ and $k_{1z}/\varepsilon_1 - k_{2z}/\varepsilon_2 = 0$, i.e., not only the poles but also the zeros of the Fresnel reflection coefficient are included. For nonmagnetic materials, Eq. (10.44) becomes

$$k_x = \frac{\omega}{c}\sqrt{\frac{\varepsilon_1\varepsilon_2}{\varepsilon_1 + \varepsilon_2}} \tag{10.45}$$
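The algebra between Eq. (10.42) and Eqs. (10.44) and (10.45) can be sketched as follows, using only the relations for $k_{1z}^2$ and $k_{2z}^2$ quoted from Eq. (10.1). The squaring in the first step discards the relative sign of $k_{1z}/\varepsilon_1$ and $k_{2z}/\varepsilon_2$, which is why the zeros of the reflection coefficient appear alongside the poles.

$$
\begin{aligned}
\frac{k_{1z}}{\varepsilon_1} = \mp\frac{k_{2z}}{\varepsilon_2}
\;&\Rightarrow\;
\frac{1}{\varepsilon_1^2}\left(\mu_1\varepsilon_1\frac{\omega^2}{c^2} - k_x^2\right)
= \frac{1}{\varepsilon_2^2}\left(\mu_2\varepsilon_2\frac{\omega^2}{c^2} - k_x^2\right) \\
&\Rightarrow\;
k_x^2\left(\frac{1}{\varepsilon_1^2} - \frac{1}{\varepsilon_2^2}\right)
= \frac{\omega^2}{c^2}\left(\frac{\mu_1}{\varepsilon_1} - \frac{\mu_2}{\varepsilon_2}\right) \\
\mu_1 = \mu_2 = 1
\;&\Rightarrow\;
k_x^2 = \frac{\omega^2}{c^2}\,
\frac{1/\varepsilon_1 - 1/\varepsilon_2}{(1/\varepsilon_1 - 1/\varepsilon_2)(1/\varepsilon_1 + 1/\varepsilon_2)}
= \frac{\omega^2}{c^2}\,\frac{\varepsilon_1\varepsilon_2}{\varepsilon_1 + \varepsilon_2}
\end{aligned}
$$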

One should bear in mind that the permittivities are in general functions of the frequency. For a metal with a negative real permittivity, the normal component of the wavevector is purely imaginary for any real $k_x$ because $\mu\varepsilon\omega^2/c^2 < 0$. Thus, evanescent waves exist in metals regardless of the angle of incidence.


Consider either the Otto or Kretschmann-Raether configuration, and use the three-layer structure with a middle layer, medium 1, of thickness $d$. From Eq. (9.7), the reflection coefficient can be expressed as follows:

$$r = \frac{r_{01} + r_{12}e^{2i\beta}}{1 + r_{01}r_{12}e^{2i\beta}} = \frac{r_{01} + r_{12}e^{-2\eta_1 d}}{1 + r_{01}r_{12}e^{-2\eta_1 d}} \tag{10.46}$$

where the subscript 0 signifies the incidence medium, which is the prism, and $\beta = k_{1z}d = i\eta_1 d$. When $d$ is sufficiently large, $\exp(-2\eta_1 d) \ll 1$, and the reflectance $R'_\lambda = rr^* \approx r_{01}r_{01}^*$ is close to unity. When surface polaritons are excited, however, $r_{12}$ increases dramatically, and thus it is possible for $r_{12}e^{-2\eta_1 d}$ to be of the same magnitude as $r_{01}$, but with an opposite phase, i.e., with a phase difference of $\pi$. At the condition of surface plasmon resonance, the reflectance $R'_\lambda$ drops suddenly. Let us use an example to illustrate the polariton dispersion curves and the effect on the reflectance in ATR arrangements.

Example 10-6. Calculate the dispersion relation between Al and air. Calculate the reflectance versus angle of incidence for both the Otto and Kretschmann-Raether configurations at $\lambda = 500$ nm, using Al as the metallic material. Determine the polariton propagation length at the wavelength $\lambda = 500$ nm. Assume the prism is made of KBr with $\varepsilon_d = 2.46$ and the dielectric function of Al can be described by the Drude model.

Solution. The Drude model parameters for Al have been given in Example 8-6. Thus, we have $\varepsilon_2(\omega) = 1 - \omega_p^2/(\omega^2 + i\omega\gamma)$, where the plasma frequency $\omega_p = 2.4 \times 10^{16}$ rad/s and the scattering rate $\gamma = 1.4 \times 10^{14}$ rad/s. One way to calculate the dispersion relation is to assume $\omega$ is real and calculate $k_x(\omega) = k_x'(\omega) + ik_x''(\omega)$. The dispersion curves between Al and air ($\varepsilon_{\text{air}} = 1$) are usually plotted in an $\omega$-$k_x$ graph, with the real part of $k_x$ shown in Fig. 10.14a by the solid line. At very low frequencies, the magnitude of $\varepsilon_2$ is so large that $k_x \approx \omega/c$. Note that the dash-dotted line with

[Figure 10.14. Panel (a): frequency $\omega/(2\pi c)$ (cm$^{-1}$) versus wavevector component $k_x/2\pi$ (cm$^{-1}$), both from 0 to $2 \times 10^5$, showing the polariton dispersion curve, the light line $k_x = \omega/c$, the prism light line $k_x = \varepsilon_d^{1/2}\omega/c$, and $\omega_p$. Panel (b): reflectance $R'_\lambda$ versus incidence angle $\theta$ from $38^\circ$ to $43^\circ$ at $\lambda = 500$ nm with $\varepsilon_d = 2.46$ and $\varepsilon_{\text{Al}} = -39.5 + 1.5i$, for $d = 24$ nm and 16 nm (Al middle layer) and $d = 900$ nm and 1200 nm (air middle layer).]

FIGURE 10.14 (a) The dispersion relation of surface plasmon polaritons between Al and air, where $k_x$ is the real part solution of Eq. (10.45). (b) Reflectance in ATR arrangements, either with Al or air as the middle layer.

$k_x = \omega/c$ represents the light line. On the left of this line, there exist propagating waves in air, whereas on the right of the light line, evanescent waves occur in air because $k_x > \omega/c$. The light line can be considered as a wave traveling in air along the x direction. On the polariton dispersion curve, $k_x$ increases quickly as $\omega$ increases and reaches an asymptote at $\omega = \omega_p/\sqrt{2}$, when the real part of the dielectric function of Al approaches $-1$. For $\omega_p/\sqrt{2} < \omega < \omega_p$, the real part of the dielectric function of Al is negative with an absolute value less than 1. Therefore, the solution of Eq. (10.45) has a large imaginary part, while the real part of $k_x$ drops to near zero, as reflected by the


bending of the dispersion curve toward the left and the steep rise upward. Beyond $\omega = \omega_p$, the metal becomes transparent and the real part of the dielectric function becomes positive. Solutions beyond $\omega = \omega_p$ correspond to zeros in the reflection coefficient and thus are not solutions of Eq. (10.42), which are poles of the reflection coefficient. Notice that the dotted line refers to the light line of the prism. In the shaded region, there exist evanescent waves in air but propagating waves appear in the prism; as a result, surface plasmons can be coupled to propagating waves in the prism.

The reflectance is calculated from Eq. (10.46) at the wavelength $\lambda = 500$ nm, corresponding to a wavenumber of 20,000 cm$^{-1}$. As can be seen from Fig. 10.14a, at this frequency, the surface polariton curve is very close to the light line in air. Therefore, the excitation of surface polaritons is expected to be near the critical angle $\theta_c \approx 39.6^\circ$ between the prism and air. Without surface polaritons, the reflectance would be close to 1 at $\theta > \theta_c$. However, as shown in Fig. 10.14b, the reflectance drops suddenly around $40^\circ$ due to the excitation of surface polaritons. Furthermore, the reflectance dips are very sensitive to the thickness of the middle layer. In the Otto configuration, the air thickness of 900 nm yields a sharp dip. For the Kretschmann-Raether configuration, on the other hand, a metallic film thickness of 24 nm yields a sharp dip in the reflectance. If the Al film exceeds 50 nm, the reflectance is close to 1. The location and the width of the reflectance minimum depend on the thickness of the middle layer. When the surface plasmon polariton is excited, a large absorption occurs in the metal, which results in a coupling of the electromagnetic energy to a surface wave.

The propagation length of the surface wave can be determined based on the imaginary part of $k_x$, i.e., $k_x''$. Note that the field can be expressed as $e^{ik_x'x - k_x''x}$ for surface waves propagating in the positive x direction and as $e^{-ik_x'x + k_x''x}$ for surface waves propagating in the negative x direction. The power is proportional to the square of the field amplitude, and the (1/e) power decay length, or the polariton propagation length, is30

$$l_{sp} = \frac{1}{2k_x''} \tag{10.47}$$

Plugging the values into Eq. (10.45), we obtain $l_{sp} \approx 80\ \mu$m. Note that the Drude model somewhat underpredicts the imaginary part of the dielectric function. If $\mathrm{Im}(\varepsilon)$ of Al were taken as 10 at $\lambda = 500$ nm, one would obtain $l_{sp} \approx 13\ \mu$m, still much longer than the wavelength.
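The numbers in this example can be reproduced approximately with the short sketch below. It evaluates the Drude dielectric function with the parameters quoted in the solution, the complex $k_x$ of Eq. (10.45), the propagation length of Eq. (10.47), and the TM reflectance of Eq. (10.46) for the Kretschmann-Raether arrangement. The function names are hypothetical, and the TM Fresnel coefficient is assumed to take the standard form $r_{ij} = (k_{iz}/\varepsilon_i - k_{jz}/\varepsilon_j)/(k_{iz}/\varepsilon_i + k_{jz}/\varepsilon_j)$, whose denominator matches Eq. (10.42).

```python
import cmath
import math

C0 = 2.998e8  # speed of light in vacuum, m/s


def eps_drude(omega, omega_p=2.4e16, gamma=1.4e14):
    """Drude dielectric function with the Al parameters quoted in the example."""
    return 1.0 - omega_p**2 / (omega**2 + 1j * omega * gamma)


def kx_spp(omega, eps1, eps2):
    """Complex parallel wavevector of the surface polariton, Eq. (10.45)."""
    return (omega / C0) * cmath.sqrt(eps1 * eps2 / (eps1 + eps2))


def kz(eps, kx, k0):
    """Normal wavevector component, choosing the root with Im(kz) >= 0."""
    root = cmath.sqrt(eps * k0**2 - kx**2)
    return -root if root.imag < 0.0 else root


def r_tm(eps_i, eps_j, kz_i, kz_j):
    """TM Fresnel reflection coefficient at the i-j interface (assumed form)."""
    return (kz_i / eps_i - kz_j / eps_j) / (kz_i / eps_i + kz_j / eps_j)


def reflectance_atr(theta, lam, eps_prism, eps_mid, eps_out, d):
    """TM reflectance of prism | middle layer (thickness d) | outer medium, Eq. (10.46)."""
    k0 = 2.0 * math.pi / lam
    kx = math.sqrt(eps_prism) * k0 * math.sin(theta)
    k0z = kz(eps_prism, kx, k0)
    k1z = kz(eps_mid, kx, k0)
    k2z = kz(eps_out, kx, k0)
    r01 = r_tm(eps_prism, eps_mid, k0z, k1z)
    r12 = r_tm(eps_mid, eps_out, k1z, k2z)
    phase = cmath.exp(2j * k1z * d)  # e^{2 i beta} with beta = k1z * d
    r = (r01 + r12 * phase) / (1.0 + r01 * r12 * phase)
    return abs(r) ** 2


lam = 500e-9
omega = 2.0 * math.pi * C0 / lam
eps_al = eps_drude(omega)
l_sp = 1.0 / (2.0 * kx_spp(omega, 1.0, eps_al).imag)  # Eq. (10.47)

# Kretschmann-Raether: KBr prism | 24-nm Al film | air, scanned from 38 to 43 degrees.
angles = [38.0 + 0.005 * i for i in range(1001)]
refl = [reflectance_atr(math.radians(a), lam, 2.46, eps_al, 1.0, 24e-9) for a in angles]
theta_min = angles[refl.index(min(refl))]

if __name__ == "__main__":
    print(f"eps_Al = {eps_al:.2f}")
    print(f"l_sp = {l_sp * 1e6:.1f} micrometers")
    print(f"reflectance minimum {min(refl):.3f} at {theta_min:.2f} degrees")
```

The propagation length obtained this way should be close to the value of about 80 μm quoted above, and the reflectance minimum should fall slightly above the critical angle. The Otto arrangement can be explored with the same function by passing air as the middle layer and Al as the outer medium.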

Another way to excite surface plasmon or phonon polaritons is by gratings. When light is incident onto a grating at a given $k_x$, the Bloch-Floquet condition given by Eq. (9.63a) in Sec. 9.4 states that the reflected and refracted waves can have different values of the parallel component of the wavevector: $k_{x,j} = k_x + 2\pi j/\Lambda$, where $j$ is the diffraction order and $\Lambda$ is the period of the grating. For this reason, the dispersion relation can be folded into the region $k_x \le \pi/\Lambda$, and surface polaritons can be excited on a grating surface. As an example, Fig. 10.15a shows the reduced dispersion relation for a binary grating made of Ag with

[Figure 10.15. Panel (a): wavenumber (cm$^{-1}$) versus $k_x/(2\pi)$ (cm$^{-1}$), showing the dispersion line, the folded dispersion lines, and the lines for incidence at $\theta = 30^\circ$. Panel (b): reflectance $R'_\lambda$ versus wavenumber (cm$^{-1}$) for TM waves at $\theta = 0^\circ$ and $30^\circ$, for an Ag grating with $\Lambda = 1.7\ \mu$m, $d = 0.1\ \mu$m, and filling ratio $f = 0.65$.]

FIGURE 10.15 (a) Dispersion curves for gratings. (b) Reflectance for an Ag grating.

$\Lambda = 1.7\ \mu$m. The solid lines are the folded dispersion curves, and the dash-dotted lines, which are also folded, correspond to an incidence angle of $30^\circ$. The intersections identify the locations where surface plasmons can be excited for TM wave incidence, when the magnetic field is parallel to the grooves. The reflectance of a shallow grating on Ag is calculated and plotted in Fig. 10.15b at $\theta = 0^\circ$ and $30^\circ$. The grating height is $d = 100$ nm, and the filling ratio is $f = 0.65$ (see Fig. 9.18 for the grating geometry). For a TE wave, no drops exist in the reflectance because surface waves cannot be excited. The reflectance is very high for TE waves and differs little between $\theta = 0^\circ$ and $\theta = 30^\circ$. For a TM wave, the excitation of surface polaritons is responsible for the dips in the reflectance. Furthermore, the frequency locations agree well with those predicted by the dispersion curves. Note that at normal incidence, the excitation frequencies are located at the intersections between the dispersion curve and the vertical axis, as shown in Fig. 10.15a. These dips have also been known as Wood's or the Rayleigh-Wood anomalies, when a diffraction order just appears at the grazing angle; see Hessel and Oliner (Appl. Opt., 4, 1275, 1965). The actual resonance frequency may shift slightly from the frequency associated with the appearance or disappearance of a diffraction order, because the dispersion curve is not a straight line. The Rayleigh-Wood anomaly may also occur for gratings whose dielectric functions have a positive real part, i.e., not associated with surface plasmon polaritons. It should be mentioned that many polar dielectric or semiconductor materials such as MgO, SiC, and GaAs contain a phonon absorption band, called the reststrahlen band, where $\mathrm{Re}(\varepsilon)$ is negative and $\mathrm{Im}(\varepsilon)$ is very small. The surface polariton condition described in Eq. (10.42) can be satisfied in the infrared, and the associated excitation or resonance is called a surface phonon polariton.
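The folding argument can be illustrated with a rough estimate of where the dips are expected. The sketch below assumes that the polariton dispersion curve is approximated by a straight line $k_{sp} = s\,\omega/c$ with $s = 1$, i.e., the light line, so it actually locates the Rayleigh condition; as noted above, the true resonance is shifted slightly because the dispersion curve is not straight. The function name `folded_crossings` and the parameter `s` are hypothetical.

```python
import math


def folded_crossings(period_um, theta_deg, nu_max=18000.0, s=1.0):
    """Wavenumbers (cm^-1) where k_x + 2*pi*j/period equals +/- k_sp.

    With wavenumber nu = omega/(2*pi*c), the condition
    nu*sin(theta) + j/period = +/- s*nu gives
        nu = j/(period*(s - sin(theta)))     for j > 0
        nu = |j|/(period*(s + sin(theta)))   for j < 0
    Returns a sorted list of (wavenumber, diffraction order) pairs.
    """
    period_cm = period_um * 1.0e-4
    sin_t = math.sin(math.radians(theta_deg))
    hits = []
    for j in range(1, 50):
        for order, denom in ((j, s - sin_t), (-j, s + sin_t)):
            if denom <= 0.0:
                continue
            nu = j / (period_cm * denom)
            if nu <= nu_max:
                hits.append((round(nu, 1), order))
    return sorted(hits)


if __name__ == "__main__":
    for theta in (0.0, 30.0):
        print(theta, folded_crossings(1.7, theta))
```

At normal incidence the $+j$ and $-j$ orders coincide, so each crossing appears twice in the list. At oblique incidence the degeneracy is lifted and the two branches separate.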
In the following discussion of polaritons, the word "metal" is used to signify a material with a negative real permittivity, or a negative-$\varepsilon$ material. Surface roughness is yet another way to excite surface waves because a rough surface can be considered as a Fourier expansion of multiple periodic components, each acting as a grating. Clearly, there is considerable room to tune the radiative properties by exciting surface polaritons with different geometries. The resonance behavior in nanoparticles or quantum dots has enormous applications in chemical sensing and medical diagnoses. Plasmon waveguides, which are based on the resonance of nanoparticles, nanowires, and nanotips, may allow electromagnetic energy transfer beyond the diffraction limit; see, for example, Maier et al. (Nature Mater., 2, 229, 2003), Dickson and Lyon (J. Phys. Chem. B, 104, 6095, 2000), and Stockman (Phys. Rev. Lett., 93, 137404, 2004). Mie in 1908 developed the formula to describe scattering from small absorbing particles, and expressed the scattering coefficient and the absorption coefficient in the limit of a small sphere, whose radius $r_0$ is much smaller than the wavelength in vacuum $\lambda$, as

$$Q_{\text{sca},\lambda} = \frac{8}{3}\left(\frac{2\pi r_0}{\lambda}\right)^4 \varepsilon_1^2 \left|\frac{\varepsilon_2 - \varepsilon_1}{\varepsilon_2 + 2\varepsilon_1}\right|^2 \tag{10.48}$$

and

$$Q_{\text{abs},\lambda} = \frac{8\pi r_0}{\lambda}\sqrt{\varepsilon_1}\,\mathrm{Im}\left(\frac{\varepsilon_2 - \varepsilon_1}{\varepsilon_2 + 2\varepsilon_1}\right) \tag{10.49}$$

where $\varepsilon_1$ is the dielectric function of the surrounding dielectric medium and $\varepsilon_2$ is that of the absorbing sphere.32 While Eq. (10.48) has the same form as the expression of Rayleigh scattering with the $1/\lambda^4$ relationship of the scattering cross section, defined as $4\pi r_0^2 Q_{\text{sca},\lambda}$, the scattering of metallic spheres is distinctly different from that of dielectric spheres because the dielectric function of metals is complex and depends strongly on the wavelength. The scattering cross section is usually a very complex function of the wavelength. This is especially true when the resonance condition $\varepsilon_2 = -2\varepsilon_1$ is satisfied. This resonance


is associated with the localized surface plasmon polaritons. Geometric optics completely fails to describe scattering and absorption by small particles. The scattering cross section can be much greater than the actual surface area. Furthermore, the absorbed energy can exceed that of a blackbody of the same size. In fact, the blackbody concept is misleading in the subwavelength regime. The actual resonance condition may be complicated for different geometries and coatings, as well as for clusters of particles or nanoparticle aggregates. A detailed discussion of resonance in metallic and polar dielectric materials in the absorption band can be found in Bohren and Huffman;32 also see Yang et al. (J. Chem. Phys., 102, 869, 1995), Link et al. (J. Phys. Chem. B, 103, 3073, 1999), Jin et al. (Science, 294, 1901, 2001), and Kottmann et al. (Phys. Rev. B, 64, 235402, 2001). Resonance phenomena in small particles have been applied to surface-enhanced Raman scattering microscopy and surface-enhanced fluorescence microscopy for single-molecule detection. The study of resonance phenomena in small particles continues to be an active research area because of the applications in biological imaging and molecular sensing; for details, refer to Moskovits (Rev. Mod. Phys., 57, 783, 1985), Chen et al. (Nano Lett., 5, 473, 2005), Johansson et al. (Phys. Rev. B, 72, 035427, 2005), and Pustovit and Shahbazyan (Phys. Rev. B, 73, 085408, 2006). Surface wave scattering has been used as a technique to characterize metallic nanoparticles.33
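To see how strongly the small-sphere efficiencies respond to the resonance condition, the following sketch evaluates Eqs. (10.48) and (10.49) directly. The function names are hypothetical, and the two permittivity values used in the demonstration are made-up illustrative numbers rather than data for a real metal; one is placed at $\mathrm{Re}(\varepsilon_2) = -2\varepsilon_1$ and the other away from it.

```python
import math


def q_sca(r0, lam, eps1, eps2):
    """Scattering efficiency of a small sphere, Eq. (10.48)."""
    ratio = (eps2 - eps1) / (eps2 + 2.0 * eps1)
    return (8.0 / 3.0) * (2.0 * math.pi * r0 / lam) ** 4 * eps1**2 * abs(ratio) ** 2


def q_abs(r0, lam, eps1, eps2):
    """Absorption efficiency of a small sphere, Eq. (10.49)."""
    ratio = (eps2 - eps1) / (eps2 + 2.0 * eps1)
    return (8.0 * math.pi * r0 / lam) * math.sqrt(eps1) * ratio.imag


if __name__ == "__main__":
    r0, lam, eps1 = 10e-9, 500e-9, 1.0   # 10-nm-radius sphere in vacuum
    for label, eps2 in (("on resonance", complex(-2.0, 0.3)),
                        ("off resonance", complex(-10.0, 0.3))):
        print(label, q_sca(r0, lam, eps1, eps2), q_abs(r0, lam, eps1, eps2))
```

With the illustrative on-resonance value, the absorption efficiency exceeds unity, which is the sense in which a small particle can absorb more than geometric optics would allow. The off-resonance value gives an efficiency that is smaller by orders of magnitude.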

10.2.2 Coupled Surface Polaritons and Bulk Polaritons

Polaritons can exist on both surfaces of a thin film, resulting in a standing wave inside the film, as shown in Fig. 10.16. Economou performed a detailed investigation of different

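[Figure 10.16. A slab of thickness $d$ (medium 2, with $\varepsilon_2$, $\mu_2$) between medium 1 ($\varepsilon_1$, $\mu_1$) and medium 3 ($\varepsilon_3$, $\mu_3$), with three field profiles labeled 1, 2, and 3.]

FIGURE 10.16 Illustration of polaritons in a slab. 1: symmetric mode coupled surface polaritons; 2: antisymmetric mode coupled surface polaritons; and 3: bulk polariton.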

configurations of a thin-film structure,34 while Kovacs and Scott (Phys. Rev. B, 16, 1297, 1977) studied the optical excitation of surface plasma waves in layered structures. An essential requirement for coupled surface polaritons to occur is the existence of evanescent waves that decay in both media 1 and 3. Such a method was used in Sec. 10.1.2 for obtaining the mode equation for waveguides. A more convenient method to derive the polariton relations is to set the denominator of the reflection coefficient to zero. From Eq. (10.46), we can see that for the configuration shown in Fig. 10.16, $r = (r_{12} + r_{23}e^{2ik_{2z}d})/(1 + r_{12}r_{23}e^{2ik_{2z}d})$, which has poles at $1 + r_{12}r_{23}e^{2ik_{2z}d} = 0$. For a TM wave, this can be expressed as follows:

$$\tanh(ik_{2z}d)\left(\frac{k_{2z}^2}{\varepsilon_2^2} + \frac{k_{1z}k_{3z}}{\varepsilon_1\varepsilon_3}\right) = \frac{k_{2z}}{\varepsilon_2}\left(\frac{k_{1z}}{\varepsilon_1} + \frac{k_{3z}}{\varepsilon_3}\right) \tag{10.50}$$

which is the polariton dispersion relation for a slab sandwiched between two semi-infinite media. Because $\tanh(ik_{2z}d) = i\tan(k_{2z}d)$, Eq. (10.50) is identical to the mode equation of


a planar waveguide given in Eq. (10.16), when medium 3 i